[comp.unix.ultrix] SYSTEM logins weird

hurf@batcomputer.tn.cornell.edu (Hurf Sheldon) (01/12/90)

uVaxII, Ultrix3.1
	
	We just had to shut down and reboot a system to switch some disks.
We had to halt the thing during one of the reboots. Ran fsck on /dev/ra0a
after booting up single user - it made some minor repairs on one inode.
After that the logins to the hardware ttys aren't there even though the
gettys are running. Additionally the login prompt no longer displays the
hostname: just 'login'

The network ports and the console and X term logins all work but the dhv ports
have gone south. I checked /etc/ttys and /dev - can't see anything wrong -
a 'cat /etc/ttys >/dev/tty08' gets nothing out of tty08...
Our dhv is a dilog 1620 16 port board.

Uerf doesn't report any errors...

I put a new kernel on and rebooted - no fix.

suggestions?
hurf


-- 
     Hurf Sheldon			 Network: hurf@ionvax.tn.cornell.edu
     Lab of Plasma Studies		  Bitnet: hurf@CRNLION
     369 Upson Hall, Cornell University, Ithaca, N.Y. 14853  ph:607 255 7267
     "And the walls came tumbling down"

grr@cbmvax.commodore.com (George Robbins) (01/12/90)

In article <9527@batcomputer.tn.cornell.edu> hurf@tcgould.tn.cornell.edu (Hurf Sheldon) writes:
> uVaxII, Ultrix3.1
> 	
> 	We just had to shut down and reboot a system to switch some disks.
> We had to halt the thing during one of the reboots. Ran fsck on /dev/ra0a
> after booting up single user - it made some minor repairs on one inode.
> After that the logins to the hardware ttys aren't there even though the
> gettys are running. Additionally the login prompt no longer displays the
> hostname: just 'login'
> 
> The network ports and the console and X term logins all work but the dhv ports
> have gone south. I checked /etc/ttys and /dev - can't see anything wrong -
> a 'cat /etc/ttys >/dev/tty08' gets nothing out of tty08...
> Our dhv is a dilog 1620 16 port board.

Pretty spooky!

Check that the /dev entries still show up as "character" type special files
with rational major and minor device numbers, and also reasonable ownership/
protections.
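
The /dev check above can be sketched in a few lines. This is a rough
illustration in Python, not a tool that exists on Ultrix 3.1; the helper
name describe_node is made up here, and which major/minor numbers count
as "rational" depends on the kernel configuration, so the sketch only
reports them rather than judging them.

```python
import os
import stat


def describe_node(path):
    """Report the properties of a /dev entry that are worth eyeballing."""
    st = os.stat(path)
    return {
        "path": path,
        # Terminal lines should be character special files.
        "is_char": stat.S_ISCHR(st.st_mode),
        # Device numbers as recorded in the inode; what they should be
        # is site-specific and not checked here.
        "major": os.major(st.st_rdev),
        "minor": os.minor(st.st_rdev),
        # Ownership and permission bits.
        "owner": st.st_uid,
        "mode": oct(stat.S_IMODE(st.st_mode)),
    }
```

Calling describe_node("/dev/tty08") and comparing the result against a
neighbouring line that still works is one way to spot an entry that was
damaged or recreated as a plain file.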

Check that /etc/ttys and /etc/gettytab haven't been "improved".

Look at /usr/spool/mqueue/syslog (or wherever you have syslog going) - this
is where init/getty/login report their problems.

Finally, if you have hardcopy of the fsck run, take the inode number and
use "dcheck -i" to see what it was.  See if anything interesting has shown
up in lost+found.

-- 
George Robbins - now working for,	uucp: {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@uunet.uu.net
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)

hurf@batcomputer.tn.cornell.edu (Hurf Sheldon) (01/12/90)

	Posting my own followup:
	I found that /etc/gettytab had been
	rewritten with the top 3690 bytes of /etc/passwd - this
	is the second time under Ultrix3.1 on 2 different systems
	this has happened - ?????
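
	One way to confirm this kind of overwrite is to count how many
	leading bytes the two files share. A minimal sketch in Python
	(an assumption for illustration, nothing that ships with Ultrix;
	shared_prefix_len is a made-up helper name):

```python
def shared_prefix_len(path_a, path_b, chunk=4096):
    """Return how many leading bytes the two files have in common."""
    total = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            n = min(len(a), len(b))
            # Stop at the first byte that differs.
            for i in range(n):
                if a[i] != b[i]:
                    return total + i
            total += n
            # A short read means one of the files has ended.
            if n < chunk:
                return total
```

	Run against /etc/gettytab and /etc/passwd, a result close to the
	size of the clobbered file would match the overwrite described
	above.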

hurf
-- 
     Hurf Sheldon			 Network: hurf@ionvax.tn.cornell.edu
     Lab of Plasma Studies		  Bitnet: hurf@CRNLION
     369 Upson Hall, Cornell University, Ithaca, N.Y. 14853  ph:607 255 7267
     "And the walls came tumbling down"

envbvs@epb2.lbl.gov (Brian V. Smith) (01/12/90)

In article <9533@batcomputer.tn.cornell.edu>,
hurf@batcomputer.tn.cornell.edu (Hurf Sheldon) writes:
< 
< 	Posting my own followup:
< 	I found that /etc/gettytab had been
< 	rewritten with the top 3690 bytes of /etc/passwd - this
< 	is the second time under Ultrix3.1 on 2 different systems
< 	this has happened - ?????
< 

We had the same problem in Ultrix 3.0.
At first I thought I had rcp'd the file on top of itself which is guaranteed
to clobber the file in a similar way, but then I realized that
I hadn't changed /etc/gettytab (ever), so I hadn't run rcp at all on it.
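
For anyone wondering why copying a file onto itself is destructive, here
is a minimal sketch in Python. It assumes the copy opens the destination
for writing, which truncates it, before reading any of the source;
whether rcp on Ultrix works exactly this way is an assumption, and the
function names are invented for the example.

```python
import os


def naive_copy(src, dst):
    """Copy src to dst the careless way: dst is truncated when opened."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # If src and dst are the same file, it is already empty by now,
        # so this loop reads nothing and the contents are lost.
        while True:
            chunk = fin.read(8192)
            if not chunk:
                break
            fout.write(chunk)


def careful_copy(src, dst):
    """Refuse to copy a file onto itself, then copy as usual."""
    if os.path.exists(dst) and os.path.samefile(src, dst):
        raise ValueError("source and destination are the same file")
    naive_copy(src, dst)
```

Note that this failure leaves the file empty or cut short. It does not by
itself explain another file's contents turning up in its place.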

It happened on two of our six machines.
Any ideas?
____________________________________
Brian V. Smith    (bvsmith@lbl.gov)
Lawrence Berkeley Laboratory
I don't speak for LBL, these non-opinions are all mine.

steven@pacific.csl.uiuc.edu (Steven Parkes) (01/12/90)

In article <4614@helios.ee.lbl.gov>, envbvs@epb2.lbl.gov (Brian V.
Smith) writes:
> From: envbvs@epb2.lbl.gov (Brian V. Smith)
> Newsgroups: comp.unix.ultrix
> Subject: Re: SYSTEM logins weird
> 
> < 	I found that /etc/gettytab had been
> < 	rewritten ....
> We had the same problem in Ultrix 3.0. ...

We've had a similar problem on 3500's under 3.1 ... it wasn't passwd
but part of /usr/spool/mail/*.  Sounds like more than a fluke ...

grr@cbmvax.commodore.com (George Robbins) (01/12/90)

In article <4614@helios.ee.lbl.gov> envbvs@epb2.lbl.gov (Brian V. Smith) writes:
> In article <9533@batcomputer.tn.cornell.edu>,
> hurf@batcomputer.tn.cornell.edu (Hurf Sheldon) writes:
> < 
> < 	I found that /etc/gettytab had been
> < 	rewritten with the top 3690 bytes of /etc/passwd - this
> < 	is the second time under Ultrix3.1 on 2 different systems
> < 	this has happened - ?????
> 
> We had the same problem in Ultrix 3.0.
> At first I thought I had rcp'd the file on top of itself which is guaranteed
> to clobber the file in a similar way, but then I realized that
> I hadn't changed /etc/gettytab (ever), so I hadn't run rcp at all on it.
> 
> Any ideas?

Remember that the startup script edits the /etc/gettytab and /etc/ttys
files to update the system name.  I suspect this kind of thing might be
happening if your system croaks without having written out the new version.
On the other hand, unless it died within a few moments after startup, it
*should* have written the stuff from the buffer pool...

-- 
George Robbins - now working for,	uucp: {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@uunet.uu.net
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)

greg@duke.cs.unlv.edu (Greg Wohletz) (01/13/90)

In article <9533@batcomputer.tn.cornell.edu>,
hurf@batcomputer.tn.cornell.edu (Hurf Sheldon) writes:
< 
< 	Posting my own followup:
< 	I found that /etc/gettytab had been
< 	rewritten with the top 3690 bytes of /etc/passwd - this
< 	is the second time under Ultrix3.1 on 2 different systems
< 	this has happened - ?????
< 



hmm, this has happened several times to me as well.  Ultrix 3.1 has
been quite a headache ever since I upgraded from 2.0.  I've posted two
messages about various problems I have had and received NO replies
whatsoever.  Doesn't anyone from the ultrix group read this newsgroup?

    	    	    	    	    	--Greg

avolio@decuac.dec.com (Frederick M. Avolio) (01/13/90)

In article <1453@jimi.cs.unlv.edu> greg@duke.cs.unlv.edu (Greg Wohletz) writes:
>hmm, this has happened several times to me as well.  Ultrix 3.1 has
>been quite a headache ever since I upgraded from 2.0.  I've posted two
>messages about various problems I have had and received NO replies
>whatsoever.  Doesn't anyone from the ultrix group read this newsgroup?
>
>    	    	    	    	    	--Greg


Well, send me your two problems and/or repost them.  Yes, people from
the ULTRIX group read this (and no, I am not in the ULTRIX group) but
the best bet to get fixes is still through buying support.  I guess
that sounds sort of capitalistic, eh? :-).

The problems Hurf is having sound like file system problems to me...
But I am hard-pressed to blame it on software since very few systems
(I now know of two) have had these problems.  

Fred

pat@orac.pgh.pa.us (Pat Barron) (01/13/90)

In article <9533@batcomputer.tn.cornell.edu> hurf@tcgould.tn.cornell.edu (Hurf Sheldon) writes:
>	Posting my own followup:
>	I found that /etc/gettytab had been
>	rewritten with the top 3690 bytes of /etc/passwd - this
>	is the second time under Ultrix3.1 on 2 different systems
>	this has happened - ?????

I've seen this a lot, too.  I've always suspected that it had something
to do with the fact that /etc/gettytab is edited in /etc/rc or rc.local
(to put the current system version string into the default login banner)
but have never been able to prove anything....

--Pat.
-- 
Pat Barron
Internet:  pat@orac.pgh.pa.us  - or -   orac!pat@gateway.sei.cmu.edu
UUCP:  ...!uunet!apexepa!sei!orac!pat  - or -  ...!pitt!darth!orac!pat

greg@duke.cs.unlv.edu (Greg Wohletz) (01/14/90)

In article <2877@decuac.DEC.COM>, avolio@decuac.dec.com (Frederick M.
Avolio) writes:
> 
> Well, send me your two problems and/or repost them.  Yes, people from
> the ULTRIX group read this (and no, I am not in the ULTRIX group) but
> the best bet to get fixes is still through buying support.  I guess
> that sounds sort of capitalistic, eh? :-).

Sorry for the flame, I've had a bad week...  Anyway.  We actually
tried to order source code maintenance with our tape, but somehow the
order got screwed up.  Here is my last message again.


In article <1444@jimi.cs.unlv.edu>, I write:

> We have three microvax  II's that we  use as fileservers.  Each has  3
> Wren V's and   an Exabyte hooked  into  a  Sigma  scsi  controller  (it
> emulates  a  uda  and  tms controller).   They    also  have a Dec  uda
> controller hooked to  an  rd52 and two rx50's  (yes  we've had  these
> machines for  a while...) on  them.   We have been  running with  this
> configuration under Ultrix  2.0 without  many problems (well a few nfs
> bugs, but nothing major).  Recently we got Ultrix 3.1.  I installed it
> on one of  our microvax's  and everything seemed  to be  going fine, I
> could use the disks, and read from the Exabyte.  However, when I tried
> to dump the root filesystem  to the Exabyte  I got a write error, then
> some message like ``mscp resynching controller uq2'' at that point the
> system locked up.

Well,  I've  investigated the situation   a bit further,   and  I have
discovered that (surprise, surprise)  one difference  between  2.0 and
3.1 is that  all of the disk and  tape drive stuff  appears to have been
rewritten.   Now  everything  (except  for  non-uda   type drives and
non-tmscp tapes) goes through this new mscp code (or at least  that is
what it looks  like to me).  Anyway looking  at the code didn't reveal
anything obvious,  however  I have noticed  that I  can't  dump to the
trusty (?) old rx50's.   The first volume  of the dump works fine, but
if you so  much as open the  door to the floppy  when dump asks you to
insert the next volume all subsequent  attempts to write to the floppy
will fail (if you leave the same floppy in (without opening  the drive
door) for  ALL of the volumes it  will work...).   I suspect that this
problem is related to the same bug.  I think  at this point I'm almost
convinced that it is a software bug, and not  a problem with the Sigma
controller, but I could be wrong.

So  the question  is, will  someone  from DEC tell  me if  there  is a
known/fixable bug in 3.1 that would cause this behavior?

Would it be possible to graft in the old tmscp  code  from 2.0 without
an inordinate amount of pain?

ANY information would be greatly appreciated.

                                        --Greg
                                        greg@unlv.edu
                                        <@relay.cs.net:greg@unlv.edu>

brw@hertz.njit.edu (Brian White) (01/15/90)

In article <2877@decuac.DEC.COM> avolio@decuac.dec.com writes:
>
>Well, send me your two problems and/or repost them.  Yes, people from
>the ULTRIX group read this (and no, I am not in the ULTRIX group) but
>the best bet to get fixes is still through buying support.  I guess

We have it; I'm not convinced it's worthwhile, though...

>that sounds sort of capitalistic, eh? :-).
>
>The problems Hurf is having sound like file system problems to me...
>But I am hard-pressed to blame it on software since very few systems
>(I now know of two) have had these problems.  
>
>Fred

Uhhh, make that three, Fred. Our 11/785 (running 3.1) crashed and a 
*News article* showed up in place of gettytab. No login message, no baud rate
set, and 8 lines of vi hipped me to the problem, but that sure was weird....

Brian White
System Programmer
New Jersey Institute of Technology
brw@hertz.njit.edu brw@njit.edu brw@jazz.njit.edu (yeah, I know, showoff....)

grr@cbmvax.commodore.com (George Robbins) (01/15/90)

In article <973@njitgw.njit.edu> brw@hertz.njit.edu (Brian White) writes:
> In article <2877@decuac.DEC.COM> avolio@decuac.dec.com writes:
> >
> >The problems Hurf is having sound like file system problems to me...
> >But I am hard-pressed to blame it on software since very few systems
> >(I now know of two) have had these problems.  
> >
> >Fred
> 
> Uhhh, make that three, Fred. Our 11/785 (running 3.1) crashed and a 
> *News article* showed up in place of gettytab. No login message, no baud rate
> set, and 8 lines of vi hipped me to the problem, but that sure was weird....

I guess a few well placed "sync" commands in the /etc/rc.local file should
cure this symptom, however it's real perverso that this file should not
get flushed and forgotten, especially if the system doesn't crash immediately.
After all, /etc/update should do the trick within 30 seconds.
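
The idea of forcing the edited file out before anything else can go wrong
can be sketched as follows. This is a hedged illustration in Python for a
present-day POSIX system, not what /etc/rc or /etc/rc.local actually do
on Ultrix; rewrite_durably is a hypothetical name.

```python
import os
import tempfile


def rewrite_durably(path, data):
    """Replace path with data (bytes), flushing to disk before the swap.

    The new file is created with mkstemp's default permissions, so the
    caller may need to restore the original mode and ownership.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the data blocks out to disk
        os.replace(tmp, path)      # rename over the old file in one step
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # Flush the directory too, so the rename itself is on disk.
    dfd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

The point of the temporary file is that a crash part way through should
leave either the old gettytab or the new one, not a half-written mix.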

I haven't seen this particular disease, but I have occasionally found
completely trash files in my uucp queues (I know most uucp is trash, but
these are obviously not the right kind of trash for where they show up 8-).

-- 
George Robbins - now working for,	uucp: {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@uunet.uu.net
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)