john@wa3wbu.UUCP (John Gayman) (07/21/89)
I have been having a repeating problem recently with losing inodes
in my News file system. I have a separate file system ( /dev/dsk/1s3 )
on my 2nd hard disk for /usr/spool/news. About every other day the inode
count gets screwed up. It usually ends up being "1" free inode left.
Umounting and running fsck always fixes it up and then I'm back to
4000-6000 inodes. Has anyone had this problem with V/386 3.0Ue ?

I've just recently started getting a full News-feed and that's when
the problem started. Prior to that I was getting about 25 groups from
uunet and had run like that for almost a year without any problems.
I'm running Bnews 2.11 patch 8 presently. Any comments are appreciated.
Thanks.

John
--
John Gayman, WA3WBU  | UUCP: uunet!wa3wbu!john
1869 Valley Rd.      | ARPA: john@wa3wbu.uu.net
Marysville, PA 17053 | Packet: WA3WBU @ AK3P
bill@twwells.com (T. William Wells) (07/23/89)
In article <456@wa3wbu.UUCP> john@wa3wbu.UUCP (John Gayman) writes:
: I have been having a repeating problem recently with losing inodes
: in my News file system. I have a separate file system ( /dev/dsk/1s3 )
: on my 2nd hard disk for /usr/spool/news. About every other day the inode
: count gets screwed up. It usually ends up being "1" free inode left.
: Umounting and running fsck always fixes it up and then I'm back to
: 4000-6000 inodes. Has anyone had this problem with V/386 3.0Ue ?
It is a standard problem with SysV3.0 (actually, all SysV Unixes
earlier than 3.1, or so I'm told). I have a patch for Microport
SysV/386 3.0e, which presumably is somewhat different from 3.0Ue. I
described my patch a few months ago; the original article is appended.
While you may not be able to use the patch directly, you may be able
to figure out how to do the patch yourself.
Good luck!
---
Bill { uunet | novavax } !twwells!bill
Path: twwells!bill
From: bill@twwells.uucp (T. William Wells)
Newsgroups: comp.unix.microport,comp.unix.wizards,comp.bugs.sys5
Subject: Another fix for the SYSV inode problem
Message-ID: <347@twwells.uucp>
Date: 29 Jan 89 14:01:28 GMT
Reply-To: bill@twwells.UUCP (T. William Wells)
Organization: None, Ft. Lauderdale
Lines: 79
Two things happened today (really yesterday, but this fix took all
night): first, my system crashed twice due to the inode bug, and
second, I received Jim Valerio's stuff on fixing the inode problem for
System V/386 Release 3. I'm running Microport's 3.0e; and here is
what is going on.
S5ialloc (aka ialloc) has a bug. It seems that the code is dependent
on the condition that the inode cache always contains the lowest free
inode. This is a condition that just can't be met.
Jim Valerio's fix is to always scan the inode list when the cache
runs out. I didn't like that; my system is already disk-bound and I
don't want to add more load on the disk, so I disassembled the code
and found a fix. My fix is to ignore a failure to read inodes and
try again. This has the advantage of not requiring a rescan except
when the inode pointer gets screwed up.
The following is relevant code from the disassembly:
define(`NICINOD', 100)
define(`s_ninode', 212(%edi)) # short number of i-nodes in s_inode
define(`s_inode', 214(%edi)) # ushort free i-node list
define(`s_tinode', 436(%edi)) # ushort total free inodes
.readinodes: .0xFC
movw s_tinode,%ax / check to see that there are some free inodes
testw %ax,%ax
je .noinodes / no, branch to the error handler
movw $NICINOD,s_ninode / this is the number of inodes we can read
movzwl s_inode,%eax / the first inode to read from the disk
...
.0x209:
movw s_ninode,%ax / did we get enough inodes for the cache?
testw %ax,%ax
jle .0x236 / yes, proceed
leal s_inode,%eax / this is the address of the inode table
movswl s_ninode,%edx / this is how many inodes we couldn't get
decl %edx / stick a zero before the inodes to force
movw $0,(%eax,%edx,2) / a reread when they are all used up
movw $0,s_inode / zero the first inode in the cache
.0x236:
movswl s_ninode,%eax / if no inodes were read into the cache
cmpl $NICINOD,%eax
je .noinodes / fail due to lack of inodes
movw $NICINOD,s_ninode / otherwise set the cache pointer to its end
jmp .0x2C5 / and then go back to allocating
---
Note these two key facts:
1) .readinodes checks s_tinode before reading from the disk.
2) If nothing gets put into the cache, the first entry of s_inode
is zeroed.
Therefore, if the last `je .noinodes' is changed to `je .readinodes',
the bug goes away! Does anyone see any problem with this patch?
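
The two key facts can be tried out in a toy model. What follows is only
a sketch under stated assumptions, not the kernel code: the function
name `refill', the set standing in for the on-disk inode list, and the
`start == 0' guard are all invented for illustration. It assumes the
scan runs upward from the remembered inode number, which is what the
disassembly comments suggest.

```python
NICINOD = 100  # size of the in-core free-inode cache, as in the disassembly


def refill(free_on_disk, start, s_tinode, patched):
    """Toy model of the .readinodes path (hypothetical, for illustration).

    free_on_disk: set of free inode numbers in the on-disk inode list
    start:        inode number the scan resumes from (s_inode[0] in the listing)
    s_tinode:     superblock count of free inodes
    patched:      True models `je .readinodes`, False the original `je .noinodes`
    Returns the list of inodes put in the cache, or None for "out of inodes".
    """
    while True:
        if s_tinode == 0:  # fact 1: checked before every disk scan
            return None
        found = sorted(i for i in free_on_disk if i >= start)[:NICINOD]
        if found:
            return found
        if not patched or start == 0:
            # Unpatched: give up. The start == 0 test is a guard for this
            # model only, so a stale s_tinode cannot spin forever here.
            return None
        start = 0  # fact 2: s_inode[0] was zeroed, so the retry scans from the top


free = {5, 7}  # two free inodes, both below the stale scan pointer
print(refill(free, 50, len(free), patched=False))  # None
print(refill(free, 50, len(free), patched=True))   # [5, 7]
```

With the scan pointer left at 50, the unpatched path reports no inodes
even though two are free; the patched path retries from the top and
finds them.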
---
On my system, I patched s5.o (/etc/atconf/modules/s5/s5.o) with the
following:
adb -w s5.o <<+
s5ialloc+244?W0fffffeb4
+
REMEMBER. MAKE A BACKUP COPY OF S5.O AND VERIFY THAT THE CODE LOOKS
SOMETHING LIKE MINE!!!!
Then rebuild your kernel. All kernels made after this change will
have the fix.
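
The patch value can be sanity-checked with a little arithmetic. This is
a sketch, and it assumes three things not stated above: that adb reads
the offset 244 as hexadecimal, that the longword at s5ialloc+244 is the
32-bit displacement of the `je', and that the displacement is the last
field of that instruction. The constant names are made up for the
example.

```python
import struct

PATCH_OFFSET = 0x244      # s5ialloc+244, assuming adb treats the offset as hex
PATCH_VALUE = 0xFFFFFEB4  # the longword written by ?W

# Reinterpret the 32-bit value as a signed little-endian displacement.
(disp,) = struct.unpack("<i", struct.pack("<I", PATCH_VALUE))

# A relative jump is measured from the end of the jump instruction; the
# 4-byte displacement is assumed to be the instruction's final field.
next_insn = PATCH_OFFSET + 4
target = next_insn + disp

print(disp, hex(target))  # -332 0xfc
```

Under those assumptions the new jump target works out to offset 0xFC,
which matches the .0xFC label on .readinodes in the listing. If the
arithmetic comes out differently against your own s5.o, the patch
address is probably not the same on your release.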
---
Bill
{ uunet!proxftl | novavax } !twwells!bill
randy@rls.UUCP (Randall L. Smith) (07/24/89)
In article <1989Jul22.172031.20292@twwells.com>, bill@twwells.com (T. William Wells) writes:
> In article <456@wa3wbu.UUCP> john@wa3wbu.UUCP (John Gayman) writes:
> : About every other day the inode count gets screwed up.
>
> It is a standard problem with SysV3.0 (actually, all SysV Unixes
> earlier than 3.1, or so I'm told). I have a patch for Microport
> SysV/386 3.0e, which presumably is somewhat different from 3.0Ue. I
> described my patch a few months ago;
>
> [... patch deleted for brevity ...]

I must attest that the methodology really works. I was plagued with
this monster for a few weeks and finally got PO'ed enough to do
something. (That's what it usually takes :-}). The process is simple
too. Just copy (cp, NOT ln or mv) your /unix kernel to another name
before you do anything. Since the method only attacks the inode
segment of the kernel, the patch works fine with DosMerge too.

Many thanks to Bill Wells!

Cheers!
- randy

Usenet:   randy@rls.uucp
Bangpath: ...<backbone>!osu-cis!rls!randy
Internet: rls!randy@tut.cis.ohio-state.edu
pwilcox@paldn.UUCP (Peter McLeod Wilcox) (07/25/89)
In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
> I have been having a repeating problem recently with losing inodes
> in my News file system.

You describe both my system and my problem exactly. There was a post a
few days ago to this group about what the problem is (inode allocation
bug in System V), but the fix requires source access.

I have a kludge which keeps the problem under control. When I start
unpacking I also start a job which creates and deletes about a dozen
files on the news file system every 2 minutes, checking every 10-20
minutes to see if rnews is still running (ps & fgrep).

There is a 3.1 uPort system, but Microport seems to have finally gone
completely belly up: their BBS is disconnected again, and as soon as
office hours start, their phone is taken off the hook. I don't know if
3.1 fixes the problem, but if anyone out there knows how to get it I
would love to try.

In a nutshell, the problem occurs when the last inode has been
allocated from the inode list in the super block. If a file is deleted
at that point, its inode is placed in the free list. When that inode is
allocated, the following search for inodes only goes from the start of
the list to that point. The kludge above involves making sure that the
free list end and search occur without a delete between, i.e. the
number of files created should be greater than the number of articles
unbatched during the sleep time. There is still a failure window, but
the probability is reduced.

Again, if anyone out there knows how to get the 3.1 update, either from
uPort or whoever is tending their affairs, please let us know!
--
Pete Wilcox ...gatech!nanovx!techwood!paldn!pwilcox
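
The keep-busy job described in this post could look roughly like the
sketch below. It is not Pete's actual script, which was not posted: the
function names, the `churn.' file prefix, and the choice to create all
the files before deleting any are assumptions, and the `ps -e' check
stands in for his "ps & fgrep".

```python
import os
import subprocess
import tempfile
import time


def churn_once(spool_dir, count=12):
    """Create `count` scratch files in spool_dir, then delete them all."""
    paths = []
    for _ in range(count):
        fd, path = tempfile.mkstemp(prefix="churn.", dir=spool_dir)
        os.close(fd)
        paths.append(path)
    for path in paths:
        os.unlink(path)
    return len(paths)


def rnews_running():
    """Rough equivalent of piping ps through fgrep rnews."""
    out = subprocess.run(["ps", "-e"], capture_output=True, text=True).stdout
    return "rnews" in out


def keep_busy(spool_dir, interval=120, checks_every=5):
    """Churn every `interval` seconds; stop once rnews has exited."""
    rounds = 0
    while True:
        churn_once(spool_dir)
        rounds += 1
        # With the defaults this checks for rnews every 10 minutes.
        if rounds % checks_every == 0 and not rnews_running():
            return rounds
        time.sleep(interval)
```

Calling keep_busy("/usr/spool/news") while unbatching would run it with
the 2-minute interval from the post. As the post says, `count' should
exceed the number of articles unbatched per interval, so the default of
12 is only a starting point.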
dave@micropen (David F. Carlson) (07/25/89)
In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
>
> I have been having a repeating problem recently with losing inodes
> in my News file system. I have a separate file system ( /dev/dsk/1s3 )
> on my 2nd hard disk for /usr/spool/news. About every other day the inode
> count gets screwed up.
> John Gayman, WA3WBU | UUCP: uunet!wa3wbu!john

Although it appears that fsck "fixes" the problem, it does not!
Tar out the file system to /tmp. mkfs the file system again. Bump up
the inodes so that you are guaranteed to run out of blocks before inodes.
My news is 15 meg and has 15000 inodes (yes, one per block.) I just
got tired of this and went crazy, but it *never* has had the problem again!
--
David F. Carlson, Micropen, Inc.
micropen!dave@ee.rochester.edu

"The faster I go, the behinder I get." --Lewis Carroll
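
For a sense of what one inode per block costs, here is a
back-of-the-envelope calculation. It is a sketch with assumed figures:
1K blocks (implied by 15 meg and roughly 15000 inodes) and 64 bytes per
on-disk inode, the usual S5 size. The function name is made up for the
example.

```python
DINODE_SIZE = 64  # bytes per on-disk inode; assumed, the usual S5 figure


def inode_budget(fs_bytes, block_size=1024):
    """Return (inodes, inode table bytes, fraction of the file system)
    when one inode is allotted per block."""
    blocks = fs_bytes // block_size
    inodes = blocks  # one inode per block
    table_bytes = inodes * DINODE_SIZE
    return inodes, table_bytes, table_bytes / fs_bytes


inodes, table_bytes, fraction = inode_budget(15 * 1024 * 1024)
print(inodes, table_bytes, f"{fraction:.2%}")  # 15360 983040 6.25%
```

Under those assumptions the inode table takes a little under a megabyte
of the 15, and since the table itself uses up blocks, the file system
has fewer data blocks than inodes.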
bill@twwells.com (T. William Wells) (07/27/89)
In article <804@micropen> dave@micropen (David F. Carlson) writes:
: In article <456@wa3wbu.UUCP>, john@wa3wbu.UUCP (John Gayman) writes:
: >
: > I have been having a repeating problem recently with losing inodes
: > in my News file system. I have a separate file system ( /dev/dsk/1s3 )
: > on my 2nd hard disk for /usr/spool/news. About every other day the inode
: > count gets screwed up.
: > John Gayman, WA3WBU | UUCP: uunet!wa3wbu!john
:
: Although it appears that fsck "fixes" the problem, it does not!
: Tar out the file system to /tmp. mkfs the file system again. Bump up
: the inodes so that you are guaranteed to run out of blocks before inodes.
: My news is 15 meg and has 15000 inodes (yes, one per block.) I just
: got tired of this and went crazy, but it *never* has had the problem again!

The problem is that there is a kernel bug in pre 3.2 SysV systems. It
causes the file system to think that there are no free inodes when
there really are. What fsck fixes is the count of free inodes in the
superblock; that permits the system to continue working. However, the
bug hasn't gone away; next time the right conditions occur, bang goes
the count and away goes your file system.

Increasing the number of inodes ought to decrease the probability of
the bug occurring; it does *not* eliminate it.

As I just recently posted, this is a fairly easy thing to fix if you
are a kernel hacker. I have a patch for Microport SysV/386 3.0e. If
anyone wants it, send me e-mail. I've also heard of patches for other
systems.
---
Bill { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com
laurie@ucscb.UCSC.EDU (60648000) (08/02/89)
In article <130@paldn.UUCP> pwilcox@paldn.UUCP (Peter McLeod Wilcox) writes:
>rnews is still running (ps & fgrep). There is a 3.1 uPort system, but
>Microport seems to finally gone completely belly up, their BBS is
>disconnected again, and as soon as office hours start, their phone is
>taken off the hook. I don't know if 3.1 fixes the problem but if anyone

There is no uport 3.1 unix. The most recent version is 3.0e.1, which
is a uport internal version number. It is still based on at&t SVR3.0,
just like all of the previous uport 386 unix releases. No changes were
made in the s5ialloc module by uport in the 3.0e.1 release. I believe
that at&t fixed it in their SVR3.1 release.

BTW, Microport is not belly up and they do not take their phone off of
the hook. At the moment, they are a bit short handed and there aren't
enough technical people to take care of the phones and a bbs system.
Give them a chance, they'll soon pop back to life.

Ken Chapin