[comp.unix.microport] Missing inodes on V/386 & News

john@wa3wbu.UUCP (John Gayman) (07/21/89)

  I have been having a repeating problem recently with losing inodes
in my News file system. I have a separate file system ( /dev/dsk/1s3 )
on my 2nd hard disk for /usr/spool/news. About every other day the inode
count gets screwed up. It usually ends up with "1" free inode left.
Unmounting and running fsck always fixes it up and then I'm back to
4000-6000 inodes. Has anyone had this problem with V/386 3.0Ue ?  I've
just recently started getting a full News-feed and that's when the problem
started. Prior to that I was getting about 25 groups from uunet and have
run like that for almost a year without any problems. I'm running Bnews 2.11
patch 8 presently. Any comments are appreciated. Thanks.


					John



-- 
John Gayman, WA3WBU              |           UUCP: uunet!wa3wbu!john
1869 Valley Rd.                  |           ARPA: john@wa3wbu.uu.net 
Marysville, PA 17053             |           Packet: WA3WBU @ AK3P 

bill@twwells.com (T. William Wells) (07/23/89)

In article <456@wa3wbu.UUCP> john@wa3wbu.UUCP (John Gayman) writes:
:   I have been having a repeating problem recently with losing inodes
: in my News file system. I have a separate file system ( /dev/dsk/1s3 )
: on my 2nd hard disk for /usr/spool/news. About every other day the inode
: count gets screwed up. It usually ends up with "1" free inode left.
: Unmounting and running fsck always fixes it up and then I'm back to
: 4000-6000 inodes. Has anyone had this problem with V/386 3.0Ue ?

It is a standard problem with SysV3.0 (actually, all SysV Unixes
earlier than 3.1, or so I'm told). I have a patch for Microport
SysV/386 3.0e, which presumably is somewhat different from 3.0Ue. I
described my patch a few months ago; the original article is appended.
While you may not be able to use the patch directly, you may be able
to figure out how to do the patch yourself.

Good luck!

---
Bill                            { uunet | novavax } !twwells!bill

Path: twwells!bill
From: bill@twwells.uucp (T. William Wells)
Newsgroups: comp.unix.microport,comp.unix.wizards,comp.bugs.sys5
Subject: Another fix for the SYSV inode problem
Message-ID: <347@twwells.uucp>
Date: 29 Jan 89 14:01:28 GMT
Reply-To: bill@twwells.UUCP (T. William Wells)
Organization: None, Ft. Lauderdale
Lines: 79

Two things happened today (really yesterday, but this fix took all
night): first, my system crashed twice due to the inode bug, and
second, I received Jim Valerio's stuff on fixing the inode problem for
System V/386 Release 3. I'm running Microport's 3.0e; and here is
what is going on.

S5ialloc (aka ialloc) has a bug. It seems that the code is dependent
on the condition that the inode cache always contains the lowest free
inode. This is a condition that just can't be met.

Jim Valerio's fix is to always scan the inode list when the cache
runs out.  I didn't like that; my system is already disk-bound and I
don't want to add more load on the disk, so I disassembled the code
and found a fix.  My fix is to ignore a failure to read inodes and
try again.  This has the advantage of not requiring a rescan except
when the inode pointer gets screwed up.

The following is relevant code from the disassembly:

define(`NICINOD',       100)

define(`s_ninode',      212(%edi))      # short  number of i-nodes in s_inode
define(`s_inode',       214(%edi))      # ushort free i-node list
define(`s_tinode',      436(%edi))      # ushort total free inodes

.readinodes:    .0xFC
	movw    s_tinode,%ax    / check to see that there are some free inodes
	testw   %ax,%ax
	je      .noinodes       / no, branch to the error handler
	movw    $NICINOD,s_ninode / this is the number of inodes we can read
	movzwl  s_inode,%eax    / the first inode to read from the disk
...

.0x209:
	movw    s_ninode,%ax    / did we get enough inodes for the cache?
	testw   %ax,%ax
	jle     .0x236          / yes, proceed
	leal    s_inode,%eax    / this is the address of the inode table
	movswl  s_ninode,%edx   / this is how many inodes we couldn't get
	decl    %edx            / stick a zero before the inodes to force
	movw    $0,(%eax,%edx,2) / a reread when they are all used up
	movw    $0,s_inode      / zero the first inode in the cache
.0x236:
	movswl  s_ninode,%eax   / if no inodes were read into the cache
	cmpl    $NICINOD,%eax
	je      .noinodes       / fail due to lack of inodes
	movw    $NICINOD,s_ninode / otherwise set the cache pointer to its end
	jmp     .0x2C5          / and then go back to allocating

---

Note these two key facts:

    1) .readinodes checks s_tinode before reading from the disk.

    2) If nothing gets put into the cache, the first entry of s_inode
       is zeroed.

Therefore, if the last `je .noinodes' is changed to `je .readinodes',
the bug goes away!  Does anyone see any problem with this patch?
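
The control flow described above can be modelled with a toy simulation.
This is only a sketch in Python, not kernel code: the ToyFS class, its
field names, and the set used to stand in for the on-disk inode list are
all invented for illustration. The only things taken from the listing are
NICINOD, the s_tinode check before each scan, and the zeroing of the first
cache entry after a failed scan.

```python
NICINOD = 100  # cache size, as in the disassembly


class ToyFS:
    """Toy model of the s5 superblock fields; not the real kernel layout."""

    def __init__(self, total_inodes, free, hint):
        self.total = total_inodes
        self.free = set(free)         # inodes that are really free on "disk"
        self.tinode = len(self.free)  # stands in for s_tinode (kept consistent here)
        self.cache = []               # stands in for the s_inode[] cache
        self.hint = hint              # stands in for s_inode[0], the scan start

    def _refill(self):
        # Scan the inode list from the hint upward, as .readinodes does.
        found = [i for i in range(self.hint, self.total + 1) if i in self.free]
        self.cache = found[:NICINOD]
        if not self.cache:
            self.hint = 0  # key fact 2: a failed scan zeroes the first entry
        return bool(self.cache)

    def ialloc(self, patched):
        while not self.cache:
            if self.tinode == 0:  # key fact 1: checked before any scan
                return None
            if not self._refill() and not patched:
                return None       # original code: je .noinodes
            # patched code: je .readinodes, i.e. loop and rescan from the start
        ino = self.cache.pop(0)
        self.free.discard(ino)
        self.tinode -= 1
        return ino


# Free inodes 3 and 7 lie below a stale scan start of 50.
print(ToyFS(200, {3, 7}, hint=50).ialloc(patched=False))
print(ToyFS(200, {3, 7}, hint=50).ialloc(patched=True))
```

In this model the unpatched path gives up even though two inodes are free,
while the patched path rescans from the start of the list and finds them.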

---

On my system, I patched s5.o (/etc/atconf/modules/s5/s5.o) with the
following:

adb -w s5.o <<+
s5ialloc+244?W0fffffeb4
+

REMEMBER. MAKE A BACKUP COPY OF S5.O AND VERIFY THAT THE CODE LOOKS
SOMETHING LIKE MINE!!!!

Then rebuild your kernel. All kernels made after this change will
have the fix.
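
The patch value can be cross-checked against the listing. The sketch
below (Python, for the arithmetic only) assumes that adb reads the
offset 244 as hexadecimal, that the `je' is the near form with a 4-byte
displacement as its last four bytes, and that the displacement is
relative to the following instruction. The helper name rel32 is made up
for this example.

```python
def rel32(next_insn_offset, target_offset):
    """Return the 32-bit two's-complement displacement for a near jump."""
    return (target_offset - next_insn_offset) & 0xFFFFFFFF


DISP_OFFSET = 0x244          # where the adb command writes the long word
NEXT_INSN = DISP_OFFSET + 4  # displacement is the last 4 bytes of the je
READINODES = 0xFC            # offset of .readinodes in the listing

print(hex(rel32(NEXT_INSN, READINODES)))  # 0xfffffeb4
```

Under those assumptions the result matches the value written by the adb
command, so the patched jump lands on .readinodes. If your disassembly
shows different offsets, the value has to be recomputed the same way.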

---
Bill
{ uunet!proxftl | novavax } !twwells!bill

randy@rls.UUCP (Randall L. Smith) (07/24/89)

In article <1989Jul22.172031.20292@twwells.com>, bill@twwells.com (T. William Wells) writes:
> In article <456@wa3wbu.UUCP> john@wa3wbu.UUCP (John Gayman) writes:
> : About every other day the inode count gets screwed up.
> 
> It is a standard problem with SysV3.0 (actually, all SysV Unixes
> earlier than 3.1, or so I'm told). I have a patch for Microport
> SysV/386 3.0e, which presumably is somewhat different from 3.0Ue. I
> described my patch a few months ago;
> 
> [... patch deleted for brevity ...]

I must attest that the methodology really works.  I was plagued with this
monster for a few weeks and finally got PO'ed enough to do something. 
(That's what it usually takes:-}).  The process is simple too.  Just copy
(cp NOT ln or mv) your /unix kernel to another name before you do anything.

Since the method only attacks the inode segment of the kernel, the patch
works fine with DosMerge too.  Many thanks to Bill Wells!

Cheers!

- randy

Usenet: randy@rls.uucp
Bangpath: ...<backbone>!osu-cis!rls!randy
Internet: rls!randy@tut.cis.ohio-state.edu

pwilcox@paldn.UUCP (Peter McLeod Wilcox) (07/25/89)

In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
>   I have been having a repeating problem recently with losing inodes
> in my News file system.

You describe both my system and my problem exactly.  There was a post
a few days ago to this group about what the problem is (inode allocation
bug in System V) but the fix requires source access.  I have a kludge
which keeps the problem under control.  When I start unpacking I also
start a job which creates and deletes about a dozen files on the news file
system every 2 minutes, checking every 10-20 minutes to see if
rnews is still running (ps & fgrep).  There is a 3.1 uPort system, but
Microport seems to have finally gone completely belly up: their BBS is
disconnected again, and as soon as office hours start, their phone is
taken off the hook.  I don't know if 3.1 fixes the problem but if anyone
out there knows how to get it I would love to try.

In a nutshell, the problem occurs when the last inode has been allocated
from the inode list in the super block.  If a file is deleted at that point,
its inode is placed in the free list; when it is allocated, the following
search for inodes only goes from the start of the list to that point.  The
kludge above involves making sure that the free list end and search occur
without a delete between, i.e. the number of files created should be greater
than the number of articles unbatched during the sleep time.  There is still
a failure window, but the probability is reduced.
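
The create-and-delete step of the kludge might look roughly like this.
It is a sketch in Python rather than the shell job described above; the
function name churn and the "churn." file prefix are invented here, and
the check for a running rnews is left out.

```python
import os
import tempfile


def churn(spool_dir, count=12):
    """Create and then delete `count` empty files in spool_dir."""
    paths = []
    for _ in range(count):
        # mkstemp creates the file, which allocates an inode.
        fd, path = tempfile.mkstemp(prefix="churn.", dir=spool_dir)
        os.close(fd)
        paths.append(path)
    for path in paths:
        os.unlink(path)  # release the inodes again
    return len(paths)
```

A driver would call churn() on the news file system every two minutes
for as long as unbatching is in progress.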

Again, if anyone out there knows how to get the 3.1 update, either from
uPort or whoever is tending their affairs, please let us know!
-- 
Pete Wilcox		...gatech!nanovx!techwood!paldn!pwilcox

dave@micropen (David F. Carlson) (07/25/89)

In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
> 
>   I have been having a repeating problem recently with losing inodes
> in my News file system. I have a separate file system ( /dev/dsk/1s3 )
> on my 2nd hard disk for /usr/spool/news. About every other day the inode
> count gets screwed up. 
> John Gayman, WA3WBU              |           UUCP: uunet!wa3wbu!john


Although it appears that fsck "fixes" the problem, it does not!
Tar out the file system to /tmp.  mkfs the file system again.  Bump up 
the inodes so that you are guaranteed to run out of blocks before inodes.
My news file system is 15 meg and has 15000 inodes (yes, one per block.)  I just
got tired of this and went crazy, but it *never* has had the problem again!
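
For a sense of what one inode per block costs, here is the arithmetic as
a short Python sketch. The 64-byte figure is the size of an s5 on-disk
inode; the 1K logical block size is an assumption inferred from "15 meg"
and "15000 inodes", and the function name is made up for the example.

```python
INODE_BYTES = 64    # size of an s5 on-disk inode
BLOCK_BYTES = 1024  # assumed logical block size


def ilist_blocks(n_inodes):
    """Blocks needed to hold n_inodes on-disk inodes."""
    per_block = BLOCK_BYTES // INODE_BYTES
    return -(-n_inodes // per_block)  # ceiling division


total_blocks = 15 * 1024 * 1024 // BLOCK_BYTES
print(total_blocks, ilist_blocks(15000))
```

Under these assumptions the inode list takes 938 of 15360 blocks, a
little over 6% of the file system.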

-- 
David F. Carlson, Micropen, Inc.
micropen!dave@ee.rochester.edu

"The faster I go, the behinder I get." --Lewis Carroll

bill@twwells.com (T. William Wells) (07/27/89)

In article <804@micropen> dave@micropen (David F. Carlson) writes:
: In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
: >
: >   I have been having a repeating problem recently with losing inodes
: > in my News file system. I have a separate file system ( /dev/dsk/1s3 )
: > on my 2nd hard disk for /usr/spool/news. About every other day the inode
: > count gets screwed up.
: > John Gayman, WA3WBU              |           UUCP: uunet!wa3wbu!john
:
: Although it appears that fsck "fixes" the problem, it does not!
: Tar out the file system to /tmp.  mkfs the file system again.  Bump up
: the inodes so that you are guaranteed to run out of blocks before inodes.
: My news file system is 15 meg and has 15000 inodes (yes, one per block.)  I just
: got tired of this and went crazy, but it *never* has had the problem again!

The problem is that there is a kernel bug in pre 3.2 SysV systems. It
causes the file system to think that there are no free inodes when
there really are. What fsck fixes is the count of free inodes in the
superblock; that permits the system to continue working. However, the
bug hasn't gone away; the next time the right conditions occur, bang goes
the count and away goes your file system.

Increasing the number of inodes ought to decrease the probability of
the bug occurring; it does *not* eliminate it.

As I just recently posted, this is a fairly easy thing to fix if you
are a kernel hacker. I have a patch for Microport SysV/386 3.0e. If
anyone wants it, send me e-mail. I've also heard of patches for other
systems.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

laurie@ucscb.UCSC.EDU (60648000) (08/02/89)

In article <130@paldn.UUCP> pwilcox@paldn.UUCP (Peter McLeod Wilcox) writes:
>rnews is still running (ps & fgrep).  There is a 3.1 uPort system, but
>Microport seems to have finally gone completely belly up, their BBS is
>disconnected again, and as soon as office hours start, their phone is
>taken off the hook.  I don't know if 3.1 fixes the problem but if anyone

There is no uport 3.1 unix. The most recent version is 3.0e.1, which is a uport
internal version number. It is still based on AT&T SVR3.0 just like all of the
previous uport 386 unix releases. No changes were made in the s5ialloc module
by uport in the 3.0e.1 release. I believe that AT&T fixed it in their SVR3.1
release.

BTW, Microport is not belly up and they do not take their phone off of the 
hook. At the moment, they are a bit short-handed and there aren't enough
technical people to take care of the phones and a BBS. Give them a
chance, they'll soon pop back to life.

Ken Chapin