[comp.unix.wizards] wtmp problem: garbage in /usr/adm/wtmp

chris@mimsy.UUCP (Chris Torek) (05/05/87)

In article <469@uhccux.UUCP> todd@uhccux.UUCP (The Perplexed Wiz) writes:
>... my wtmp file was getting data ok and then it looks like some
>garbage got thrown into it causing an extra field to be inserted
>between the last "good" structure field and the next entry into wtmp.

This occurs when the file system fills up.  `struct utmp' is 36
bytes long in stock 4.3BSD.  It was 20 bytes long in 4.1BSD, and
may have been 28 bytes in 4.2BSD.  None of these are even divisors
of any file system block size, so when the system runs out of
blocks, a write may succeed partially, leaving a bit of a utmp
entry behind to confuse things.
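
The divisibility claim above can be checked with a little shell arithmetic. This is only a sketch: the record sizes (20, 28, 36) come from the post, while the block sizes tried here are assumed power-of-two examples, not values taken from any particular system.

```shell
# Remainder of each block size divided by each utmp record size.
# Record sizes are from the post; the block sizes are assumed examples.
remainders() {
    for rec in 20 28 36; do
        for blk in 512 1024 4096 8192; do
            echo "record=$rec block=$blk remainder=$((blk % rec))"
        done
    done
}
remainders
```

No remainder comes out as zero. Each record size has an odd factor (5, 7, or 9) and a power of two has none, so sooner or later some record straddles a block boundary.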

(If O_APPEND mode is not broken, no write will be split across
two blocks by any I/O wait, as the inode is locked across each
call to bmap [/sys/sys/sys_inode.c$ino_rw()].)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

todd@uhccux.UUCP (The Perplexed Wiz) (05/06/87)

In article <6554@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <469@uhccux.UUCP> todd@uhccux.UUCP (The Perplexed Wiz) writes:
>>... my wtmp file was getting data ok and then it looks like some
>>garbage got thrown into it causing an extra field to be inserted
>This occurs when the file system fills up.  `struct utmp' is 36
>bytes long in stock 4.3BSD.  It was 20 bytes long in 4.1BSD, and

Hmm....That sounds right but my file systems are ok in terms of
space availability.  See below:

Filesystem    total    kbytes  kbytes  percent
   node       kbytes    used    free   used    Mounted on
/dev/ra0a       7423    5259    1421    79%    /
/dev/ra0g     195535   79183   96798    45%    /usr	<==/usr/adm/wtmp here
/dev/ra0h     205471   65181  119742    35%    /usr/users
/dev/ra1a       7423       1    6679     0%    /tmp1
/dev/ra1g     117207   70841   34645    67%    /usr/local
/dev/ra1h     283531   48580  206597    19%    /usr/spool
/dev/ra2a       7423       3    6677     0%    /tmp2
/dev/ra2d       7423       9    6671     0%    /T1
/dev/ra2e      26611     224   23726     1%    /tmp
/dev/ra2f     228015   53505  151708    26%    /T2
/dev/ra2g      38879   15978   19013    46%    /T3
/dev/ra3a       7423       9    6671     0%    /tmp3
/dev/ra3g     195535   76487   99494    43%    /uh3g
/dev/ra3h     205471   13334  171589     7%    /uh3h
/dev/ra4a       7423       9    6671     0%    /tmp4
/dev/ra4g     195535   13749  162232     8%    /uh4g
/dev/ra4h     205471    6563  178360     4%    /uh4h
/dev/ra5a       7423       9    6671     0%    /tmp5
/dev/ra5g     195535     346  175635     0%    /uh5g
/dev/ra5h     205471   65692  119231    36%    /uh5h

Very puzzling problem.  A couple of months back I heard that wtmp could
become corrupted if two users logged in at exactly the same time.
Does anyone know for sure if that is true?

BTW: I forgot to mention in my first posting that I only have
a binary license (big sigh)....todd

-- 
Todd Ogasawara, U. of Hawaii Computing Center
UUCP:		{ihnp4,seismo,ucbvax,dcdwest}!sdcsvax!nosc!uhccux!todd
ARPA:		uhccux!todd@nosc.MIL
INTERNET:	todd@uhccux.UHCC.HAWAII.EDU

dave@lsuc.UUCP (05/06/87)

In article <6554@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <469@uhccux.UUCP> todd@uhccux.UUCP (The Perplexed Wiz) writes:
>>... my wtmp file was getting data ok and then it looks like some
>>garbage got thrown into it causing an extra field to be inserted
>>between the last "good" structure field and the next entry into wtmp.
>
>This occurs when the file system fills up.  `struct utmp' is 36
>bytes long in stock 4.3BSD.  It was 20 bytes long in 4.1BSD, and
>may have been 28 bytes in 4.2BSD.  None of these are even divisors
>of any file system block size, so when the system runs out of
>blocks, a write may succeed partially, leaving a bit of a utmp
>entry behind to confuse things.

Exactly. We've had that happen a few times on our v7-based system
(wtmp record size = 20), so I keep this little script around to
fix it:
	dd if=wtmp bs=20 count=3993 of=wtmp.new1
	dd if=wtmp bs=1 skip=79872 of=wtmp.new2
followed, after checking the results, by "cat wtmp.new[12] > wtmp".
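
To see what the two dd commands do, here is a rough demonstration on a synthetic file rather than a real wtmp. The file layout (5 good records, 12 stray bytes, 3 more good records), the `rec` helper, and the name `wtmp.fixed` are all made up for illustration; only the 20-byte record size and the dd technique come from the post.

```shell
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

rec() { printf '%-20s' "$1"; }   # hypothetical helper: one padded 20-byte record

# Build the damaged file: 5 records, 12 bytes of debris, 3 records.
{
    for i in 1 2 3 4 5; do rec "good$i"; done
    printf 'XXXXXXXXXXXX'
    for i in 6 7 8; do rec "good$i"; done
} > wtmp

# Same two-step split as in the post, with numbers for this file.
dd if=wtmp bs=20 count=5 of=wtmp.new1 2>/dev/null
dd if=wtmp bs=1 skip=112 of=wtmp.new2 2>/dev/null   # 5*20 + 12 = 112
cat wtmp.new1 wtmp.new2 > wtmp.fixed

wc -c < wtmp         # 172 bytes: not a multiple of 20
wc -c < wtmp.fixed   # 160 bytes: 8 whole records
```

Writing the result to a separate file first, as here, leaves the damaged original around in case the numbers were wrong.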

Obviously, you have to find the correct numbers for this. An
interactive binary search works fine. Run things like
	dd if=wtmp bs=20 skip=3000 of=junk; who junk | more
and vary the skip count up or down, examining the output of "who junk"
each time, until you narrow it down to the exact record gone awry. A little
arithmetic will then give you the numbers to plug in.
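
The arithmetic can be written down as a small shell function. The function name and its three arguments are hypothetical, introduced here only for illustration. It assumes you have already found how many good records precede the damage and how many stray bytes follow them.

```shell
# Hypothetical helper: print the two dd commands for a damaged wtmp.
#   $1 = record size in bytes (20 on the v7-based system above)
#   $2 = number of good records before the damage
#   $3 = number of stray bytes to drop after those records
wtmp_fix_cmds() {
    recsize=$1 good=$2 stray=$3
    skip=$((recsize * good + stray))
    echo "dd if=wtmp bs=$recsize count=$good of=wtmp.new1"
    echo "dd if=wtmp bs=1 skip=$skip of=wtmp.new2"
}

# The numbers from the script above: 3993 * 20 = 79860, plus 12 stray bytes.
wtmp_fix_cmds 20 3993 12
```

With those arguments it prints the same two commands as the script above. Note that 79872 is 156 * 512, so in that example the stray bytes end exactly on a 512-byte boundary, which fits the partial-write explanation.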

David Sherman
The Law Society of Upper Canada
Toronto
-- 
{ seismo!mnetor  cbosgd!utgpu  watmath  decvax!utcsri  ihnp4!utzoo } !lsuc!dave

grr@cbmvax.cbm.UUCP (George Robbins) (05/07/87)

In article <6554@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <469@uhccux.UUCP> todd@uhccux.UUCP (The Perplexed Wiz) writes:
>>... my wtmp file was getting data ok and then it looks like some
>>garbage got thrown into it...
>
>This occurs when the file system fills up.  `struct utmp' is 36
>bytes long in stock 4.3BSD.  It was 20 bytes long in 4.1BSD, and
>may have been 28 bytes in 4.2BSD.

	The file system filling up is definitely not required to cause
	the problem with Ultrix 1.2.  Anybody have any notions on
	whether or not it might be something to do with DECnet?

-- 
George Robbins - now working for,	uucp: {ihnp4|seismo|rutgers}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@seismo.css.GOV
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)