wayne@i-core.UUCP (Wayne H Cox) (06/15/89)
Well, I finally gave up and did something about the inode bug that exists in Microport System V/AT UNIX. Two months ago, I had a major crash on my news file system. I spent several hours repairing it with 'fsck' and 'fsdb'. When I was finished, I spent a couple of hours making sure that I would never again enjoy such fun.

Below are the results of my effort. I have been running this patch for two months, and not a single squeak out of the inode list. I am using release 2.4 of the OS, your mileage may vary.

You must have the software development system and the link kit from Microport to use this patch.

Working from a previous posting by twwells!bill that included an assembly listing of the affected areas:

>
> The following is relevant code from the disassembly:
>
> define(`NICINOD', 100)
>
> define(`s_ninode', 212(%edi))   # short number of i-nodes in s_inode
> define(`s_inode', 214(%edi))    # ushort free i-node list
> define(`s_tinode', 436(%edi))   # ushort total free inodes
>
> .readinodes: .0xFC
>         movw s_tinode,%ax          / check to see that there are some free inodes
>         testw %ax,%ax
>         je .noinodes               / no, branch to the error handler
>         movw $NICINOD,s_ninode     / this is the number of inodes we can read
>         movzwl s_inode,%eax        / the first inode to read from the disk
>         ...
>
> .0x209:
>         movw s_ninode,%ax          / did we get enough inodes for the cache?
>         testw %ax,%ax
>         jle .0x236                 / yes, proceed
>         leal s_inode,%eax          / this is the address of the inode table
>         movswl s_ninode,%edx       / this is how many inodes we couldn't get
>         decl %edx                  / stick a zero before the inodes to force
>         movw $0,(%eax,%edx,2)      / a reread when they are all used up
>         movw $0,s_inode            / zero the first inode in the cache
> .0x236:
>         movswl s_ninode,%eax       / if no inodes were read into the cache
>         cmpl $NICINOD,%eax
>         je .noinodes               / fail due to lack of inodes
>         movw $NICINOD,s_ninode     / otherwise set the cache pointer to its end
>         jmp .0x2C5                 / and then go back to allocating

I began by extracting alloc.o from lib1 in the linkkit files. I ran a name list from the object file to locate the start and end of the routine called 'ialloc'. I ran a disassembly on the object file and removed assorted cruft with an editor. This is what I came up with:

ialloc:
  352:  c8 18 00 00         enter $0x18,$0x0
  356:  ff 76 06            push 0x6(%bp)
  359:  9a a4 06 60 00      lcall 0x060,0x6a4
  35e:  44                  inc %sp
  35f:  44                  inc %sp
  360:  89 46 fc            mov %ax,0xfc(%bp)
  363:  89 56 fe            mov %dx,0xfe(%bp)
  366:  e9 32 02            jmp 0x232 <59b>
  369:  6a 0a               push $0xa
  36b:  8b 46 fc            mov 0xfc(%bp),%ax
  36e:  05 9b 01            add $0x19b,%ax
  371:  ff 76 fe            push 0xfe(%bp)
  374:  50                  push %ax
  375:  9a 00 00 00 00      lcall 0x00,0x00
  37a:  83 c4 06            add $0x6,%sp
  37d:  e9 1b 02            jmp 0x21b <59b>
  380:  c5 76 fc            lds 0xfc(%bp),%si
  383:  f7 84 d0 00 ff ff   test $0xffff,0x0d0(%si)
  389:  7f 03               jg 0x3 <38e>            ; Read inodes
  38b:  e9 c2 00            jmp 0x0c2 <450>
        . . .
  556:  9a 00 00 00 00      lcall 0x00,0x00
  55b:  83 c4 04            add $0x4,%sp
  55e:  c5 76 fc            lds 0xfc(%bp),%si
  561:  f7 84 d0 00 ff ff   test $0xffff,0x0d0(%si)
  567:  7e 1f               jle 0x1f <588>
  569:  c5 76 fc            lds 0xfc(%bp),%si
  56c:  8b b4 d0 00         mov 0x0d0(%si),%si
  570:  4e                  dec %si
  571:  d1 e6               shl %si
  573:  03 76 fc            add 0xfc(%bp),%si
  576:  8e 5e fe            mov 0xfe(%bp),%ds
  579:  c7 84 d2 00 00 00   mov $0x0,0x0d2(%si)
  57f:  c5 76 fc            lds 0xfc(%bp),%si
  582:  c7 84 d2 00 00 00   mov $0x0,0x0d2(%si)
  588:  c5 76 fc            lds 0xfc(%bp),%si
  58b:  83 bc d0 00 64      cmp $0x64,0x0d0(%si)
  590:  74 19               je 0x19 <5ab>           ; Fail if none
  592:  c5 76 fc            lds 0xfc(%bp),%si
  595:  c7 84 d0 00 64 00   mov $0x64,0x0d0(%si)
  59b:  c5 76 fc            lds 0xfc(%bp),%si
  59e:  f6 84 9b 01 ff      testb $0xff,0x19b(%si)
  5a3:  75 03               jne 0x3 <5a8>
  5a5:  e9 d8 fd            jmp 0xfdd8 <380>
  5a8:  e9 be fd            jmp 0xfdbe <369>
  5ab:  c5 76 fc            lds 0xfc(%bp),%si
  5ae:  c7 84 d0 00 00 00   mov $0x0,0x0d0(%si)
  5b4:  ff 76 06            push 0x6(%bp)
  5b7:  68 90 00            push $0x90
  5ba:  68 31 09            push $0x931
  5bd:  9a 00 00 00 00      lcall 0x00,0x00
  5c2:  83 c4 06            add $0x6,%sp
  5c5:  b8 00 00            mov $0x0,%ax
  5c8:  8e d8               mov %ax,%ds
  5ca:  c6 06 55 04 1c      movb $0x1c,0x455
  5cf:  c5 76 fc            lds 0xfc(%bp),%si
  5d2:  c7 84 ae 01 00 00   mov $0x0,0x1ae(%si)
  5d8:  33 c0               xor %ax,%ax
  5da:  33 d2               xor %dx,%dx
  5dc:  c9                  leave
  5dd:  cb                  lret

I came up with the following patch:

ialloc:
  352:  c8 18 00 00         enter $0x18,$0x0
  356:  ff 76 06            push 0x6(%bp)
  359:  9a a4 06 60 00      lcall 0x060,0x6a4
  35e:  44                  inc %sp
  35f:  44                  inc %sp
  360:  89 46 fc            mov %ax,0xfc(%bp)
  363:  89 56 fe            mov %dx,0xfe(%bp)
  366:  e9 32 02            jmp 0x232 <59b>
  369:  6a 0a               push $0xa
  36b:  8b 46 fc            mov 0xfc(%bp),%ax
  36e:  05 9b 01            add $0x19b,%ax
  371:  ff 76 fe            push 0xfe(%bp)
  374:  50                  push %ax
  375:  9a 00 00 00 00      lcall 0x00,0x00
  37a:  83 c4 06            add $0x6,%sp
  37d:  e9 1b 02            jmp 0x21b <59b>
  380:  c5 76 fc            lds 0xfc(%bp),%si
  383:  f7 84 d0 00 ff ff   test $0xffff,0x0d0(%si)
  389:  7f 03               jg 0x3 <38e>            ; Read inodes
  38b:  e9 c2 00            jmp 0x0c2 <450>
        . . .
  556:  9a 00 00 00 00      lcall 0x00,0x00
  55b:  83 c4 04            add $0x4,%sp
  55e:  c5 76 fc            lds 0xfc(%bp),%si
  561:  f7 84 d0 00 ff ff   test $0xffff,0x0d0(%si)
  567:  7e 1f               jle 0x1f <588>
  569:  c5 76 fc            lds 0xfc(%bp),%si
  56c:  8b b4 d0 00         mov 0x0d0(%si),%si
  570:  4e                  dec %si
  571:  d1 e6               shl %si
  573:  03 76 fc            add 0xfc(%bp),%si
  576:  8e 5e fe            mov 0xfe(%bp),%ds
  579:  c7 84 d2 00 00 00   mov $0x0,0x0d2(%si)
  57f:  c5 76 fc            lds 0xfc(%bp),%si
  582:  c7 84 d2 00 00 00   mov $0x0,0x0d2(%si)
  588:  c5 76 fc            lds 0xfc(%bp),%si
  58b:  83 bc d0 00 64      cmp $0x64,0x0d0(%si)
  590:  74 13               je 0x13 <5a5>           ; Change here
  592:  c5 76 fc            lds 0xfc(%bp),%si
  595:  c7 84 d0 00 64 00   mov $0x64,0x0d0(%si)
  59b:  c5 76 fc            lds 0xfc(%bp),%si
  59e:  f6 84 9b 01 ff      testb $0xff,0x19b(%si)
  5a3:  75 03               jne 0x3 <5a8>
  5a5:  e9 d8 fd            jmp 0xfdd8 <380>        ; Jump to read
  5a8:  e9 be fd            jmp 0xfdbe <369>
  5ab:  c5 76 fc            lds 0xfc(%bp),%si
  5ae:  c7 84 d0 00 00 00   mov $0x0,0x0d0(%si)
  5b4:  ff 76 06            push 0x6(%bp)
  5b7:  68 90 00            push $0x90
  5ba:  68 31 09            push $0x931
  5bd:  9a 00 00 00 00      lcall 0x00,0x00
  5c2:  83 c4 06            add $0x6,%sp
  5c5:  b8 00 00            mov $0x0,%ax
  5c8:  8e d8               mov %ax,%ds
  5ca:  c6 06 55 04 1c      movb $0x1c,0x455
  5cf:  c5 76 fc            lds 0xfc(%bp),%si
  5d2:  c7 84 ae 01 00 00   mov $0x0,0x1ae(%si)
  5d8:  33 c0               xor %ax,%ax
  5da:  33 d2               xor %dx,%dx
  5dc:  c9                  leave
  5dd:  cb                  lret

I used a program called bpatch (a hex editor) to change the byte at offset 0x591 from 0x19 to 0x13. This is the displacement byte of the 'je' at 590 in the listing. With the change, when a disk read fails to return any inodes, the code jumps back to the read routine (via the 'jmp' at 5a5) instead of branching to 5ab. The old code simply called the error routine to fail.

Place the new alloc.o file into lib1 and remake the kernel. Remember to keep a backup copy of the original alloc.o in case this patch doesn't work.

Note that this only fixes the inode bug. FSCK is still broken when used with large filesystems, and the proper measures must be taken when checking them.

--Wayne
--
Wayne H Cox Ph.D.   --- --- ---   UUCP: {...}!uunet!iconsys!caeco!i-core!wayne
Inland              |   |   |     Internet: wayne@i-core.UUCP
Innovations,        |   |   |     ICBM: 40 39 04 N / 111 56 12 W
Inc.                --- --- ---   UTAH: Our taxes are as high as our mountains
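
The one-byte change described in the post can be double-checked with a little arithmetic before touching alloc.o. The sketch below is an illustration only: the helper names (rel8_target, rel16_target, patch_byte) are hypothetical, the buffer is synthetic rather than a real alloc.o, and it assumes the listing addresses can be used directly as offsets, which may not hold for the actual object file since it carries headers before the code.

```python
def rel8_target(insn_addr: int, disp: int) -> int:
    """Target of a 2-byte short jump: next instruction + signed 8-bit displacement."""
    if disp >= 0x80:
        disp -= 0x100
    return (insn_addr + 2 + disp) & 0xFFFF


def rel16_target(insn_addr: int, disp: int) -> int:
    """Target of a 3-byte near jmp (e9 lo hi): next instruction + 16-bit displacement."""
    return (insn_addr + 3 + disp) & 0xFFFF


def patch_byte(image: bytes, offset: int, old: int, new: int) -> bytes:
    """Return a copy of image with one byte changed, refusing if the old byte differs."""
    if image[offset] != old:
        raise ValueError(
            f"offset {offset:#x}: expected {old:#04x}, found {image[offset]:#04x}"
        )
    out = bytearray(image)
    out[offset] = new
    return bytes(out)


# Addresses and bytes taken from the listing in the post.
JE_ADDR = 0x590
assert rel8_target(JE_ADDR, 0x19) == 0x5AB   # original: branch to the failure path
assert rel8_target(JE_ADDR, 0x13) == 0x5A5   # patched: land on the jmp at 5a5
assert rel16_target(0x5A5, 0xFDD8) == 0x380  # which goes back to 380

# Synthetic stand-in for the code bytes; only the je at 590 is filled in.
image = bytearray(0x600)
image[0x590:0x592] = bytes([0x74, 0x19])
patched = patch_byte(bytes(image), 0x591, old=0x19, new=0x13)
print(patched[0x590:0x592].hex(" "))
```

Checking the old byte before writing is the design point here: if the byte at the chosen offset is not 0x19, the offset or the OS release is probably different and the patch should not be applied blindly.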