hybl@mbph.UUCP (Albert Hybl Dept of Biophysics SM) (09/16/88)
Mbph is using news_version "B 2.11 12/1/87" at patch_level 14; it is a leaf node from umbc3. Mbph is an AT&T 3B2/400 computer runing version 3.0 release 2 UNIX. It has two 72MB hard disks partitioned into file systems /, /usr, /usr2, /usr3 and /usr4. Mbph receives compressed news batches twice a day. Expire is run late each evening about an hour before polling our feed. We do not take a full feed; only comp, news, sci, bionet and a few other small groups. The news is retained 7 days. Everything seemed to be runing properly until one morning the following was scrolled on the screen of the console terminal: "Notice: Out of inodes on integral hard disk 1, partition 10." A df command showed that this was /usr4 and that there was only 1 inode available. It also showed that there were about 30000 blocks of disk memory available; each block can hold 1024 bytes. /usr4 is dedicated to the spool containing directories for the news groups. (The batched news segments are received by /usr/spool/uucp/umbc3.) /usr4 can hold a total of 49572 blocks and 6217 inodes. The du -a /usr4 | wc commands showed that slightly more that 3000 inodes were actually being used. The /usr4 file system was showing symptoms of Alzheimer's disease:-) I used the fsck command to cure the problem. Unfortunately, it was short lived--the patient has suffered at least two relapses. I have followed some of the activities during the transfer and processing of news batches. While the communication link is active as shown by using the uustat -q command, directory /usr/spool/uucp/umbc3 is slowly growing by adding X.* and D.* files while file TM.* is receiving a new file from the remote feed. After the communication link is disconnected, the X.* files are processed and the news files added to /usr4. The df command showed /usr adding freed blocks and inodes to its free lists while the block/inode counts were decreasing on /usr4. I have watched two news transfers in this way. During one of them the inode count for /usr4 suddenly jumped from about 2875 to 0! The following information shows a few lines of output produced after the jump by the df commands and a ps -ef command (edited to show only the germane processes). ------------------------------ /usr4 (/dev/dsk/c1d1sa ): 28470 blocks 1 i-nodes /usr4 (/dev/dsk/c1d1sa ): 28464 blocks 0 i-nodes /usr4 (/dev/dsk/c1d1sa ): 28470 blocks 1 i-nodes UID PID PPID C STIME TTY TIME COMMAND uucp 7919 1 0 02:48:57 ? 0:10 UUXQT -sumbc3 uucp 8605 7919 0 03:16:12 ? 0:00 sh -c PATH=/bin:/usr/bin:/usr/lbin LOGNAME=uucp UU_MACHINE=umbc3 UU_USER=root uucp 8606 8605 21 03:16:13 ? 0:02 rnews uucp 8607 8606 14 03:16:14 ? 0:04 compress -d uucp 8635 8606 40 03:17:03 ? 0:01 rnews /usr4 (/dev/dsk/c1d1sa ): 28470 blocks 1 i-nodes /usr4 (/dev/dsk/c1d1sa ): 28468 blocks 0 i-nodes /usr4 (/dev/dsk/c1d1sa ): 28470 blocks 1 i-nodes ------------------------------ QUESTIONS: Has anyone observed this behavior before? Is there a known bug in news2.11 or SysV software? Is there a patch for this problem? Please send a copy to me. Can anyone suggest some automated ways to monitor the news transfer to located the source of the problem? Do you have a shell or C program that can help with this task? PLEASE: Because we are losing some of our news, reply using e-mail. Thank you in advance. A. Hybl ---------------------------------------------------------------------- Albert Hybl, PhD. Office UUCP: uunet!mimsy!mbph!hybl Department of Biophysics Home UUCP: uunet!mimsy!mbph!hybl!ah University of Maryland Cosy: ahybl School of Medicine Baltimore, MD 21201 Phone: (301) 328-7940 (Office) ----------------------------------------------------------------------
appleton@amyerg.UUCP (Ewan Grantham) (09/17/88)
In Mr. Hybl's article, he mentions his /usr4 file system losing it's "memory" and suddenly running out of inodes. I have been encountering the same problem with my /usr system. I have a 3b2/300, am running news 2.11 patchlevel 14, and take a full newsfeed. Any help would be GREATLY appreciated by myself, and I'm sure by Mr. Hybl also. -- Ewan Grantham (601) 354-6454 ext. 452 Miss. Dept. of Corrections {}!swbatl!amyerg!egranthm (The Prison People) My bosses aren't responsible for me, and vice versa Go Broncos!!!
smb@ulysses.homer.nj.att.com (Steven Bellovin[jsw]) (09/17/88)
Check /usr/spool/news/.rnews for zillions of 0-length files....
gwyn@smoke.ARPA (Doug Gwyn ) (09/18/88)
In article <406@amyerg.UUCP> appleton@amyerg.UUCP (Ewan Grantham) writes: >In Mr. Hybl's article, he mentions his /usr4 file system losing it's >"memory" and suddenly running out of inodes. I have been encountering the >same problem with my /usr system. I have a 3b2/300, am running news >2.11 patchlevel 14, and take a full newsfeed. That sounds like the kernel bug in the UFS code that was thoroughly dicussed on comp.bugs.sys5 a few months ago. I suggest nagging AT&T for a fix. Someone there MUST have heard about the problem and its solution by now...
news@entire.UUCP (Usenet Administrator) (09/19/88)
Yeah, me too: I have to umount and mount the disk once a day, then the inodes come back. Please include me on any mailed patches to this one! -- John A. Gallant UUCP: {..}!rochester!rocksanne!entire!news Defacto Usenet News Administrator news@entire.uucp Entire Inc. Get your facts first, and then you can distort them as much as you please. Mark Twain
hybl@mbph.UUCP (Albert Hybl Dept of Biophysics SM) (09/22/88)
In Message-ID: <8520@smoke.ARPA> at Organization: Ballistic Research Lab (BRL), APG, MD. writes: # In article <406@amyerg.UUCP> appleton@amyerg.UUCP (Ewan Grantham) writes: # >In Mr. Hybl's article, he mentions his /usr4 file system losing it's # >"memory" and suddenly running out of inodes. # # That sounds like the kernel bug in the UFS code that was thoroughly # dicussed on comp.bugs.sys5 a few months ago. I suggest nagging AT&T # for a fix. Someone there MUST have heard about the problem and its # solution by now... # Reply-To: gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) Message-ID: <10611@ulysses.homer.nj.att.com> from Organization: AT&T Bell Laboratories, Murray Hill contained the following suggestion: > Check /usr/spool/news/.rnews for zillions of 0-length files.... (Steven Bellovin[jsw]) In my original posting, I stated that I used the wc command to count the number of lines produced by the du -a /usr4 command. One line is equivalent to one inode. Since this number was almost 3000 inodes short of full, I concluded that the file system had lost its inodes. "marbles :-)" ===============Summary of E-mail responses================ AS TO THE PROBABLE CAUSE: "This is an OLD problem common to System V implementations of UNIX. The problem resides within the kernel's inode allocation and release mechanism." From: uunet!cooper.cooper.EDU!mayer (Mayer Ilovitz) He also gave an excellent description of the problem, provided details to demonstrate it in a repeatable manner and a hint on how to fix it if you have access to the kernel sources. Paraphrased from Mayer Ilovitz mayer@cooper.cooper.edu HOW TO FIX THE PROBLEM a) Fix the source, if you have it. b) Hack (using umount, fsck and mount). The suggestions I received are essentially similar to the example posted in Message-ID: <536@happym.UUCP> From: kent@happym.UUCP (Kent Forschmiedt) c) Nag AT&T for a fix. SOME CLOSING COMMENTS First, let me thank all those individuals who replied to my plea for assistance. I will probably adopt a hacked solution. It will be specific to news2.11 and not address the basic problem in the kernel. The best solution would be for AT&T to fix the kernel code and issue a "recall" to all sysV UNIX users. Until there is a general fix, the Alzheimer's syndrome will reappear every month or two as new usenet sites rediscover it. Based upon the number of "me too" messages and postings, it is not an isolated problem. In our case, only the posting of new news is affected; the average reader is not likely to know of the problem unless he/she is waiting for a lost posting. More disruption will occur if the news directories are on the same file system as general user $HOME directories. Surely the sysV bug is a disaster waiting to happen. As the number UNIX systems continue to grow, the scope of applications will broaden. We can live with an occasional loss of news. However, there will be an application that will experience the sudden unanticipated disappearance of inodes with tragic consequences. This could be at DOD, NSA, CPHA, NASA, SSA or your local bank--who knows. ---------------------------------------------------------------------- Albert Hybl, PhD. Office UUCP: uunet!mimsy!mbph!hybl Department of Biophysics Home UUCP: uunet!mimsy!mbph!hybl!ah University of Maryland Cosy: ahybl School of Medicine Baltimore, MD 21201 Phone: (301) 328-7940 (Office) ----------------------------------------------------------------------
hybl@mbph.UUCP (Albert Hybl Dept of Biophysics SM) (09/24/88)
Message-ID: <10611@ulysses.homer.nj.att.com> from Organization:
AT&T Bell Laboratories, Murray Hill contained the following suggestion:
> Check /usr/spool/news/.rnews for zillions of 0-length files....
(Steven Bellovin[jsw])
Using "ls -CFa /usr/spool/news/.rnews", I get:
/usr/spool/news/.rnews: No such file or directory
Using "ls -CFa /usr4/spool/news/.rnews", I get:
./ ../
Message-ID: <585@elandes.UUCP> posted in Newsgroups: comp.bugs.sys5,
comp.unix.questions, comp.unix.xenix by Dave Mathis at UUCP
oliveb!elandes!dave discusses the "Subject: inode amnesia,
no disk space and uucico". It makes interesting reading and
appears to be directly related to the lost_inode problem
described in the Alzheimer's syndrome messages.
----------------------------------------------------------------------
Albert Hybl, PhD. Office UUCP: uunet!mimsy!mbph!hybl
Department of Biophysics Home UUCP: uunet!mimsy!mbph!hybl!ah
University of Maryland Cosy: ahybl
School of Medicine
Baltimore, MD 21201 Phone: (301) 328-7940 (Office)
----------------------------------------------------------------------
allbery@ncoast.UUCP (Phil Smith) (09/25/88)
As quoted from <10611@ulysses.homer.nj.att.com> by smb@ulysses.homer.nj.att.com (Steven Bellovin[jsw]):
+---------------
| Check /usr/spool/news/.rnews for zillions of 0-length files....
+---------------
Nope. There's a bug in the System V in-core inode caching code which causes
the system to think it's out of inodes when all it has to do is start at the
top of the cache linked list. The details can no doubt be supplied by some
unix.wizard out there, including a patch which you can apply if you have
source or mail to your vendor if you don't.
++Brandon
--
Brandon S. Allbery, uunet!marque!ncoast!allbery DELPHI: ALLBERY
For comp.sources.misc send mail to ncoast!sources-misc
"Don't discount flying pigs before you have good air defense." -- jvh@clinet.FI
pjh@mccc.UUCP (Pete Holsberg) (09/26/88)
Here's what I found in my archives about the inode problem: Path: mccc!princeton!rutgers!cmcl2!phri!cooper!mayer From: mayer@cooper.cooper.EDU (Mayer Ilovitz ) Newsgroups: comp.sys.att,comp.unix.wizards Subject: Analysis & test for 3b inode problem: applies to ALL users of SYSTEM V Keywords: 3b, SYSTEM V, inodes Message-ID: <1133@cooper.cooper.EDU> Date: 24 Nov 87 20:06:46 GMT Organization: The Cooper Union (NY, NY) Lines: 233 Since I haven't seen anyone post a full description of the problem or a test for it, here is my contribution. This document contains what I believe is a complete analysis of the System V inode allocation system and the problem that everyone is having. I have included a test procedure which should detect the problem on a UNIX system and included a program that will help you perform the tests. Also I have some suggestions on properly fixing the bug. To begin with, let me describe what I have available to me. We have a number of pretty standard Unix-PCs running System V 3.5 and System V 3.0 . We have a pair of 3b2-400 s running System V 3.0 Version 2. These machines each have a floppy diskette system. We also have an OLD 3b5 running System V Release 2 Version 2. I have access to the source for the Unix on the 3b5 and the 3b2 systems. Our 3b5 runs our newsfeed using the rnews package. This system has suffered the inode problems that everyone has been mentioning on the net for the last few weeks. Since this system has no expendable files systems, I ran the tests on the 3b2 and the Unix-PC . Both of these systems showed the same error. From this, I suspect that all versions of ATT System V unix have the problem. Furthermore, this problem may very well be in any ATT System-V compatible version of Unix and may well have been present in System-III Unix. I therefore suggest, just to be on the safe side, that you run the test described below. The analysis and test was based on the source from the 3b5. A cursory examination of the source to the 3b2 showed the code to be essentially the same in the critical area though there are what appear to be minor cosmetic changes. For those of you with access to the souces, The file that needs changing is called alloc.c or s5alloc.c . If you don't have a file by this name, look for a file that closely matches one of these names. The function that is causing the problem is called s5ialloc() or ialloc() . As far as I can tell ialloc and ifree are the low level inode allocation control system. When an inode is needed, a call to ialloc() is made. When a file/directory is deleted, ifree() is used to release the inode. These 2 functions use certain parameters that are kept in the superblock of every file system. tinode is the total number of free inodes in a file system. To speed up inode allocation and freeing, the superblock maintains a table of free inodes. This table is called inode[]. The size of this table is given by the #defined value NICINOD and is usually 100. ninode specifies the number of free inodes available in inode[]. When ifree() releases an inode, it first checks to see if the inode table is full. If it isn't, the inode is added to the top of the table and ninode is adjusted. If the table is full and the inode being released is less than the inode stored in inode[0], the newly released inode is put into inode[0]. In this way, the allocation system knows where in the i-list a group of free inodes are likely to be. When ialloc() is called, it tries to give the requesting process an inode from inode[]. If none are available, ialloc() searches the i-list for more free inodes to reload inode[]. ialloc() will start this search begining at the location of the last allocated inode as indicated by inode[] and ninode. The search continues untill NICINOD inodes are located or the end of the i-list is reached. inode[] will be reloaded from the top of the table working down to inode[0]. A mark is put in inode[] if less than 100 nodes were found. The next time inode[] runs out of nodes, this mark tells it to search the i-list from the very begining. If NO inodes were found during the search, ninode is SET TO 0 and the out of inodes error is printed on the system console. The problem that everyone is having is caused by the following situation. At the last reloading of inode[] exactly NICINOD inodes were found. Therefore, the inode at inode[0] is where the next search for inodes will begin. As the system runs, more inodes are allocated and freed. Eventually, the last free inode in inode[] is allocated. The system waits until the next call to ialloc to determine if it needs to reload inode[]. If a node is released before the inode table is reloaded, the freed inode will go into inode[0], replacing the old value which would be used for searching the i-list. If the freed inode was higher in the i-list than the one it replaced in the table, ialloc will no longer know that it should check the lower portion of the i-list for free inodes. It will think that everything below inode[0] is allocated already. If a significant number of lower valued inodes are not freed before ialloc has to reload the inode table, ialloc will fail to find any free inodes even though they exist. Furthermore, because of the coding of ialloc(), unless an inode is freed at some point, every time it tries looking for more inodes, it will start at the same place. So until the file system is dismounted and fsck'd, unless some inodes are freed, the system will be stuck repeating the same search and reporting the same failure. The original intent of the ialloc() - ifree() system is to minimize the time to find more free nodes by remembering the best location to start searching for more free inodes. Therefore, the best fix to ialloc would be to first try to give the requesting process a free node. ialloc() should then IMMEDIATELY check to see if that was the last free inode it had, and if it was, try reloading the inode table right then. This will prevent the possibility of the system from forgetting about the best place to search for inodes. A side result of this is that the out of inodes message will appear when the last free inode is allocated and not when ialloc failed to give an inode. An argument could be made either way as to wether this side effect is good or not. The other fix is to put a kludge into ialloc that, in the event that NO free inodes were found, it would immediately recycle through the i-list from the very beginning looking for inodes before deciding that there are no free inodes left. If the i-list is large, this can be somewhat inefficient. PROCEDURE TO TEST FOR THE 3B INODE ALLOCATION BUG This test is intended to be run on a floppy-based file system or an expendable file system. It is assumed that NICINOD, the number of inodes that are stored in the superblock inode table is 100. If not, the test will have to be adjusted accordingly. 1. create a file system with ~ 280 inodes using mkfs fsck the disk and mount it /mnt 2. verify with a df -t as to the number of free inodes and the total number of inodes in this file system. 3. allocate all the inodes on this filesystem. You can use the program fillnode given at the bottom of this document to help you do the job. The final result is that there should be 0 inodes left. Each file that you made on this disk should be named after its respective inode. 4. unmount the filesystem, do an fsck of the disk, remount and verify with a df -t that there are no free inodes. 5. free up the files with inodes 3-202. This will give you 200 free inodes on the filesystem. Verify this using step 4. 6. at this point, the file system will be mounted and the superblock inode table will contain inodes 3-102 for immediate allocation. 7. use fillnode to reallocate inodes 3-102. at this point you will have 100 free inodes when you do a df. This is the correct number of free nodes. At this point the superblock inode table will be empty. 8. use fillnode to allocate 1 inode. the inode that will be allocated is inode # 103. At this point the superblock inode table will have been reloaded from the i-list. the 0 element in the table will be inode 202 and the 99th element will be inode 103, which you just allocated. 9. Delete in order the files with inodes 30-39. At this point, the 0 element in the inode table will be inode 31 while the 99th element will be inode 30. When you released inode 30, the table was not full, so it was put onto the top of the table. When inode 31 was released, the table was full so ifree checked to see if the just freed inode was less than the inode in the 0th element of the table. Since the 0th element up to this time was 202, ifree replaced the 0th element with inode 31. Note, The inode table is now full, containing 100 free inodes, the lowest free inode in the entire i-list being in the 0th element of the table. As you release inodes 32-39, they will fail the test by ifree, the result being that these inodes ARE free but simply aren't in the inode table. This is alright since when ialloc must reload its inode table it will start looking with the inode referenced in the 0th element of the table. 10. allocate another 100 inodes. fillnode will allocate in order inodes 30,104-201 and inode 31. At this point the superblock inode table is empty again. However, as always, ialloc will leave the table empty until it must allocate an inode and finds no inodes in the table. 11. free inode 240. At this point you have sealed your doom ! . ifree will put this inode into the lowest available entry in the inode table, DESTROYING ANY MEMORY THAT THE LOWEST FREE INODE IS AROUND INODE 31. 12. Do a df -t to confirm that you still have ~ 10 free inodes. 13. allocate an inode. This inode will be inode 240. 14. Do a df -t to confirm that you now have ~ 9 free inodes. 15. Call fillnode again and say goodbye to your free inodes! At this point you will get an out of inodes error on your console and the allocation attempt will return failure. A df -t say that there are NO free inodes. What happened was that after step 13 there were no free nodes in the superblock inode table. At this point, ialloc went searching through the i-list for more free inodes starting at the inode specified in the 0th element of the inode table. BUT this no longer references inode 31, where we know there is more space, but inode 240. ialloc searches from inode 240 to the end of the i-list, but all those inodes are allocated, so ialloc decides that there are no more free inodes and reports the out of inodes error,EVEN though you still have free inodes!. 16. unmount the filesystem. Do an fsck. This will report a bad inode count in the Superblock ( Sound familiar ) which you must fix. Remount and do an df -t to confirm that you really do still have a number of free inodes. IF THE SITUATIONS DESCRIBED IN THIS TEST HAPPEN TO YOU AND YOU ARE HAVING PROBLEMS BECAUSE OF THIS BUG CONTACT YOUR ATT CUSTOMER/TECH SUPPORT REP AND REPORT THE PROBLEM below is the code for fillnode.c . This program will create a file in /mnt. The file created will be named after the inode to which it was allocated. The file will have 0 blocks allocated to it. #include <fcntl.h> #include <sys/types.h> #include <sys/stat.h> main() { int link(),open(),close(),fstat(); struct stat buf; int fd; char name[30]; if( (fd = open("/mnt/XXX",O_CREAT | O_WRONLY,0666) ) < 0 ) { printf("can't open file\n"); exit(2); } if( fstat(fd,&buf) < 0 ) { printf("error fstating file\n"); exit(3); } printf("inode is %d\n",buf.st_ino); sprintf(name,"/mnt/%d",buf.st_ino); close(fd); if( link("/mnt/XXX",name) < 0 ) { printf("can't link to new name\n"); exit(3); } if( unlink("/mnt/XXX") < 0 ) { printf("can't unlink old file /mnt/XXX\n"); exit(3); } exit(0); } Pete Holsberg UUCP: {...!rutgers!}princeton!mccc!pjh Technology Division ...!att!jonlab!mccc!pjh Mercer College CompuServe: 70240,334 1200 Old Trenton Road GEnie: PJHOLSBERG Trenton, NJ 08690 Voice: 1-609-586-4800
mayer@cooper.cooper.EDU (Mayer Ilovitz ) (09/30/88)
Because of all the postings which have appeared over the last few weeks in various newsgroups concerning mysteriously disappearing inodes which reappear after performing an fsck on the afflicted file system, I am reposting this article which will (hopefully) answer many of the questions. ( this was originally posted the week of Thanksgiving Day 1987.) Since the original posting, I have heard a rumor that the inode allocation/ free bug which is responsible for these problems will be fixed in System V 4.0. Mayer Ilovitz mayer@cooper.cooper.edu cmcl2!cooper!mayer -----------------------snip, snip, snip------------------------------------- This document contains what I believe is a complete analysis of the System V inode allocation system and the problem that everyone is having. I have included a test procedure which should detect the problem on a UNIX system and included a program that will help you perform the tests. Also I have some suggestions on properly fixing the bug. To begin with, let me describe what I have available to me. We have a number of pretty standard Unix-PCs running System V 3.5 and System V 3.0 . We have four 3b2-400's running System V 3.0 Version 2. These machines each have a floppy diskette system. We also have an OLD 3b5 running System V Release 2 Version 2. I have access to the source for the Unix on the 3b5 and the 3b2 systems. Our 3b5 runs our newsfeed using the Bnews 2.11 package. This system has suffered the inode problems that everyone has been mentioning on the net for the last few weeks. Since this system has no expendable file systems, I ran the tests on the 3b2 and the Unix-PC . Both of these systems showed the same error. From this, I suspect that all versions of ATT System V unix have the problem. Furthermore, this problem may very well be in any ATT System-V compatible version of Unix and may well have been present in System-III Unix. I therefore suggest, just to be on the safe side, that you run the test described below. The analysis and test was based on the source from the 3b5. A cursory examination of the source to the 3b2 showed the code to be essentially the same in the critical area though there are what appear to be minor cosmetic changes. For those of you with access to the souces, The file that needs changing is called alloc.c or s5alloc.c . If you don't have a file by this name, look for a file that closely matches one of these names. The function that is causing the problem is called s5ialloc() or ialloc() . Ialloc() and ifree() are the low level inode allocation control system. When an inode is needed, a call to ialloc() is made. When a file/directory is deleted, ifree() is used to release the inode. These 2 functions use certain parameters that are kept in the superblock of every file system. tinode is the total number of free inodes in a file system. To speed up inode allocation and freeing, the superblock maintains a table of free inodes. This table is called inode[]. The size of this table is given by the #defined value NICINOD and is usually 100. ninode specifies the number of free inodes available in inode[]. When ifree() releases an inode, it first checks to see if the inode table is full. If it isn't, the inode is added to the top of the table and ninode is adjusted. If the table is full and the inode being released is less than the inode stored in inode[0], the newly released inode is put into inode[0]. In this way, the allocation system knows where in the i-list a group of free inodes are likely to be. When ialloc() is called, it tries to give the requesting process an inode from inode[]. If none are available, ialloc() searches the i-list for more free inodes to reload inode[]. ialloc() will start this search begining at the location of the last allocated inode as indicated by inode[] and ninode. The search continues untill NICINOD inodes are located or the end of the i-list is reached. inode[] will be reloaded from the top of the table working down to inode[0]. A mark is put in inode[] if less than 100 nodes were found. The next time inode[] runs out of nodes, this mark tells it to search the i-list from the very begining. If NO inodes were found during the search, ninode is SET TO 0 and the out of inodes error is printed on the system console. The problem that everyone is having is caused by the following situation. At the last reloading of inode[] exactly NICINOD inodes were found. Therefore, the inode at inode[0] is where the next search for inodes will begin. As the system runs, more inodes are allocated and freed. Eventually, the last free inode in inode[] is allocated. The system waits until the next call to ialloc to determine if it needs to reload inode[]. If a node is released before the inode table is reloaded, the freed inode will go into inode[0], replacing the old value which would be used for searching the i-list. If the freed inode was higher in the i-list than the one it replaced in the table, ialloc will no longer know that it should check the lower portion of the i-list for free inodes. It will think that everything below inode[0] is allocated already. If a significant number of lower valued inodes are not freed before ialloc has to reload the inode table, ialloc will fail to find any free inodes even though they exist. Furthermore, because of the coding of ialloc(), unless an inode is freed at some point, every time it tries looking for more inodes, it will start at the same place. So until the file system is dismounted and fsck'd, unless some inodes are freed, the system will be stuck repeating the same search and reporting the same failure. The original intent of the ialloc() - ifree() system is to minimize the time to find more free nodes by remembering the best location to start searching for more free inodes. Therefore, the best fix to ialloc would be to first try to give the requesting process a free node. ialloc() should then IMMEDIATELY check to see if that was the last free inode it had, and if it was, try reloading the inode table right then. This will prevent the possibility of the system from forgetting about the best place to search for inodes. A side result of this is that the out of inodes message will appear when the last free inode is allocated and not when ialloc failed to give an inode. An argument could be made either way as to wether this side effect is good or not. The other fix is to put a kludge into ialloc that, in the event that NO free inodes were found, it would immediately recycle through the i-list from the very beginning looking for inodes before deciding that there are no free inodes left. If the i-list is large, this can be somewhat inefficient. PROCEDURE TO TEST FOR THE 3B INODE ALLOCATION BUG This test is intended to be run on a floppy-based file system or an expendable file system. It is assumed that NICINOD, the number of inodes that are stored in the superblock inode table is 100. If not, the test will have to be adjusted accordingly. 1. create a file system with ~ 280 inodes using mkfs fsck the disk and mount it /mnt 2. verify with a df -t as to the number of free inodes and the total number of inodes in this file system. 3. allocate all the inodes on this filesystem. You can use the program fillnode given at the bottom of this document to help you do the job. The final result is that there should be 0 inodes left. Each file that you made on this disk should be named after its respective inode. 4. unmount the filesystem, do an fsck of the disk, remount and verify with a df -t that there are no free inodes. 5. free up the files with inodes 3-202. This will give you 200 free inodes on the filesystem. Verify this using step 4. 6. at this point, the file system will be mounted and the superblock inode table will contain inodes 3-102 for immediate allocation. 7. use fillnode to reallocate inodes 3-102. at this point you will have 100 free inodes when you do a df. This is the correct number of free nodes. At this point the superblock inode table will be empty. 8. use fillnode to allocate 1 inode. the inode that will be allocated is inode # 103. At this point the superblock inode table will have been reloaded from the i-list. the 0 element in the table will be inode 202 and the 99th element will be inode 103, which you just allocated. 9. Delete in order the files with inodes 30-39. At this point, the 0 element in the inode table will be inode 31 while the 99th element will be inode 30. When you released inode 30, the table was not full, so it was put onto the top of the table. When inode 31 was released, the table was full so ifree checked to see if the just freed inode was less than the inode in the 0th element of the table. Since the 0th element up to this time was 202, ifree replaced the 0th element with inode 31. Note, The inode table is now full, containing 100 free inodes, the lowest free inode in the entire i-list being in the 0th element of the table. As you release inodes 32-39, they will fail the test by ifree, the result being that these inodes ARE free but simply aren't in the inode table. This is alright since when ialloc must reload its inode table it will start looking with the inode referenced in the 0th element of the table. 10. allocate another 100 inodes. fillnode will allocate in order inodes 30,104-201 and inode 31. At this point the superblock inode table is empty again. However, as always, ialloc will leave the table empty until it must allocate an inode and finds no inodes in the table. 11. free inode 240. At this point you have sealed your doom ! . ifree will put this inode into the lowest available entry in the inode table, DESTROYING ANY MEMORY THAT THE LOWEST FREE INODE IS AROUND INODE 31. 12. Do a df -t to confirm that you still have ~ 10 free inodes. 13. allocate an inode. This inode will be inode 240. 14. Do a df -t to confirm that you now have ~ 9 free inodes. 15. Call fillnode again and say goodbye to your free inodes! At this point you will get an out of inodes error on your console and the allocation attempt will return failure. A df -t say that there are NO free inodes. What happened was that after step 13 there were no free nodes in the superblock inode table. At this point, ialloc went searching through the i-list for more free inodes starting at the inode specified in the 0th element of the inode table. BUT this no longer references inode 31, where we know there is more space, but inode 240. ialloc searches from inode 240 to the end of the i-list, but all those inodes are allocated, so ialloc decides that there are no more free inodes and reports the out of inodes error,EVEN though you still have free inodes!. 16. unmount the filesystem. Do an fsck. This will report a bad inode count in the Superblock ( Sound familiar ) which you must fix. Remount and do an df -t to confirm that you really do still have a number of free inodes. IF THE SITUATIONS DESCRIBED IN THIS TEST HAPPEN TO YOU AND YOU ARE HAVING PROBLEMS BECAUSE OF THIS BUG CONTACT YOUR ATT CUSTOMER/TECH SUPPORT REP AND REPORT THE PROBLEM below is the code for fillnode.c . This program will create a file in /mnt. The file created will be named after the inode to which it was allocated. The file will have 0 blocks allocated to it. #include <fcntl.h> #include <sys/types.h> #include <sys/stat.h> main() { int link(),open(),close(),fstat(); struct stat buf; int fd; char name[30]; if( (fd = open("/mnt/XXX",O_CREAT | O_WRONLY,0666) ) < 0 ) { printf("can't open file\n"); exit(2); } if( fstat(fd,&buf) < 0 ) { printf("error fstating file\n"); exit(3); } printf("inode is %d\n",buf.st_ino); sprintf(name,"/mnt/%d",buf.st_ino); close(fd); if( link("/mnt/XXX",name) < 0 ) { printf("can't link to new name\n"); exit(3); } if( unlink("/mnt/XXX") < 0 ) { printf("can't unlink old file /mnt/XXX\n"); exit(3); } exit(0); }
peter@stca77.stc.oz (Peter Jeremy) (10/11/88)
In article <1384@cooper.cooper.EDU> mayer@cooper.cooper.EDU (Mayer Ilovitz ) writes: > Because of all the postings which have appeared over the last few weeks >in various newsgroups concerning mysteriously disappearing inodes >which reappear after performing an fsck on the afflicted file system, >I am reposting this article which will (hopefully) answer many of the >questions. ( this was originally posted the week of Thanksgiving Day 1987.) > >Since the original posting, I have heard a rumor that the inode allocation/ >free bug which is responsible for these problems will be fixed in System V 4.0. Just out of interest (I hadn't seen the problem, but I don't run any filesystems close to inode capacity so I mightn't), I tried the procedure on XENIX/286 2.2.1, and to my surprise, it passed the test. At step 15, it allocated inode 32 (I kept going and it happily allocated 33..39 and 202). It looks like this is one bug that Microsoft or SCO corrected. One interesting oddity was that when using the inodes initially (on a filesystem that has just been mkfs'd), it allocated 3..202, 211..272, 203..210. There was also a noticable delay between allocating 202 and 211, although this may just have been the system load (I was using a floppy disk for the experiment). This doesn't quite jell with my understanding of the way inode allocation is supposed to work (on a clean file system the free inode list should be sorted, and therefore inodes should be allocated in inode order). -- Peter Jeremy (VK2PJ) peter@stca77.stc.oz Alcatel-STC Australia ...!munnari!stca77.stc.oz!peter 41 Mandible St peter%stca77.stc.oz@uunet.UU.NET ALEXANDRIA NSW 2015
henry@utzoo.uucp (Henry Spencer) (10/14/88)
In article <308@stca77.stc.oz> peter@stca77.stc.oz (Peter Jeremy) writes: >...on a >clean file system the free inode list should be sorted, and therefore >inodes should be allocated in inode order). Well, do remember that there's no such thing as a "free inode list" with all the free inodes of the filesystem on it. What there is, is an in-core list of up to circa 100 free inodes. That list is only a hint to improve performance; it's the inode itself that records whether it is allocated or not. If the in-core list fills up, Unix just throws away the extra hints. If the in-core list becomes empty, Unix searches the filesystem for more free inodes to refill it. "Searches" means just what you think it does: sequential search through the inodes looking for empties. In modern Unixes an attempt is made to start the search from a different point each time, to avoid long delays. There is plenty of room for bugs and odd behavior in this scheme. (NB: SysV and BSD have each added their own wrinkles which change some of the details of the above.) -- The meek can have the Earth; | Henry Spencer at U of Toronto Zoology the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu