ghelmer@dsuvax.uucp (Guy Helmer) (09/22/90)
I just built the 386 version of MINIX, and I've been searching for the source of this error for several hours now. I receive a general protection from process number 1, pc = 0x0007:0x00000385 and the friendly "Kernel panic: exception in kernel, mm, or fs" message immediately after pressing the '=' key at the boot menu. I'm trying to run this without shoelace or db.

I've tried this on two very different 386 boxes with identical results from both, so I must have done something wrong while building the system. I've re-built the kernel several times, as well as the various tools (build, init, bootblok). bootblok has the patches to copy itself very high in memory before loading the rest of the o/s, so I don't believe it's related to the tools.

For the gurus that have built 386 kernels that work, am I right in believing that the code segment of the above address (0x0007) is the segment descriptor for the BIOS code segment that the BIOS uses in INT 0x15 function 0x89? Right now I think perhaps a GDT entry isn't being set up correctly, but that's an uneducated guess :-(

Thanks for any help!
-- 
Guy Helmer
work: DSU Computing Services, Business & Education Institute (605) 256-5315
play: MidIX System Support Services (605) 256-2788
helmer@sdnet.bitnet, ghelmer@dsuvax.uucp, uunet!loft386!dsuvax!ghelmer
awb@almond.ed.ac.uk (Alan W Black) (09/23/90)
In article <1990Sep22.055445.15470@dsuvax.uucp> ghelmer@dsuvax.uucp (Guy Helmer) writes:
>I just built the 386 version of MINIX, and I've been searching for
>the source of this error for several hours now. I receive a
>general protection from process number 1, pc = 0x0007:0x00000385
>and the friendly "Kernel panic: exception in kernel, mm, or fs"
>message immediately after pressing the '=' key at the boot menu.
>I'm trying to run this without shoelace or db.
>
> [details deleted]
>
>Guy Helmer
>work: DSU Computing Services, Business & Education Institute (605) 256-5315
>play: MidIX System Support Services (605) 256-2788
>helmer@sdnet.bitnet, ghelmer@dsuvax.uucp, uunet!loft386!dsuvax!ghelmer

I recently did this on a 386 machine. After a lot of searching and debugging we discovered that the system corrupts itself when booting. Basically we had two problems: one of our own making, and one caused by a bug in bootblok.s (as distributed; I got mine from plains.nodak.edu).

If the size of the kernel is one sector bigger than the number of tracks being loaded, it misses the last sector. The fix is something like the following (sorry, I don't have the actual bootblok.s here). At the end of the loop where it is loading sectors:

	mov ax,disksec	| see if we are done loading
	cmp ax,final	| ditto
	jb load		| jump if there is more to load

You should (I believe) decrement disksec before doing the test. I can't remember if this is exactly right, but it is definitely in this area, and after we changed this it worked.

Now it happens that if you build a kernel with the small number of disk buffers (in minix/config.h), i.e. not the INTEL_32BITS default but the old default, it will work without the above change.

The other problem (which I actually think is what you are hitting first) is that there were no instructions on how to build the default db (x386_1.1/tools/db.s). I originally simply bcc'd it, but it seems you should build the .o file and then ld it on its own, without the library.
(Thanks to Richard Tobin for doing most of this debugging.)

Hope this doesn't just confuse the issue.

Alan

Alan W Black                          80 South Bridge, Edinburgh, UK
Dept of Artificial Intelligence       tel: (+44) -31 225 7774 x228 or x223
University of Edinburgh               email: awb@ed.ac.uk
ghelmer@dsuvax.uucp (Guy Helmer) (09/26/90)
In <1990Sep22.055445.15470@dsuvax.uucp> ghelmer@dsuvax.uucp (Guy Helmer) writes:
>I just built the 386 version of MINIX, and I've been searching for
>the source of this error for several hours now. I receive a
>general protection from process number 1, pc = 0x0007:0x00000385
>and the friendly "Kernel panic: exception in kernel, mm, or fs"
>message immediately after pressing the '=' key at the boot menu.
>I'm trying to run this without shoelace or db.

I quit trying to run without db. Last night I tried to debug 386 Minix for a few hours and only decided that the general protection exception was happening after the TTY proc was initialized but before MM was initialized. I didn't have time to track it down farther.

This morning, though, I rebuilt everything after changing the number of buffers in the cache to 30, and 386 Minix started without any complaints. I'd like to find the source of the trouble when using large numbers of buffers and the plain Minix bootstrap. I guess I'll have to figure out how to use shoelace to boot Minix so I can have lots of cache until I get this problem figured out.

>Thanks for any help!

Thanks to everyone who responded!
-- 
Guy Helmer
work: DSU Computing Services, Business & Education Institute (605) 256-5315
play: MidIX System Support Services (605) 256-2788
helmer@sdnet.bitnet, ghelmer@dsuvax.uucp, uunet!loft386!dsuvax!ghelmer
wkt@csadfa.cs.adfa.oz.au (09/27/90)
In article <1990Sep25.221053.23430@dsuvax.uucp>, Guy Helmer writes:
>This morning, though, I rebuilt everything after changing the number
>of buffers in the cache to 30, and 386 Minix started without any complaints.
>I'd like to find the source of the trouble when using large numbers
>of buffers and the plain Minix bootstrap.

Bruce Evans wrote to me a few weeks ago explaining the problem. It seems that when build is compiled as a 16-bit binary it can't cope with sizes bigger than 64K. Specifying enough buffers to push a size past this limit gives the problem you've seen.

Solution: Build a '386 kernel with 30 buffers. Use it to make a 32-bit build, and use _it_ to build a kernel with, say, 300 buffers. This works!

BTW, you get an awfully large image, around 500K, with this. This is mostly empty space. Has anybody thought of a method of removing the empty (uninitialised data) part of the image, and creating it at boot time instead?!

Cheers,
Warren Toomey
wkt@csadfa.cs.adfa.oz.au
evans@syd.dit.CSIRO.AU (Bruce.Evans) (10/03/90)
In article <1990Sep22.055445.15470@dsuvax.uucp> ghelmer@dsuvax.uucp (Guy Helmer) writes:
>I just built the 386 version of MINIX, and I've been searching for
>the source of this error for several hours now. I receive a
>general protection from process number 1, pc = 0x0007:0x00000385
>and the friendly "Kernel panic: exception in kernel, mm, or fs"
>message immediately after pressing the '=' key at the boot menu.
>I'm trying to run this without shoelace or db.

Warren Toomey has already answered this. This reply has been delayed by news timewarp.

The problem is that the 16-bit "build" messes up the 32-bit image with the default number of buffers, because it uses unsigned to hold various sizes, and silently truncates the long bss size in fs's exec header. The 32-bit build seems to work OK but I still recommend shoelace.

The other main problem with building Minix-386 (again using the new build) was that the makefile didn't cover "db". Add this:
---
AS86	=as -0 -a
LD86	=ld -0

/etc/db:	db.s
	$(AS86) -o db.o db.s
	$(LD86) -o /etc/db db.o
	rm db.o
---

>For the gurus that have built 386 kernels that work, am I right
>in believing that the code segment of the above address (0x0007)
>is the segment descriptor for the BIOS code segment that the

No.

0x0007 = 0000 0000 0000 0 1 11 binary.
         ssss ssss ssss s t pp

Where 's' = descriptor table index (= 0), 't' = LDT/GDT table selector (= 1 for LDT), 'p' = privilege level (= 3 for servers and users/0). The index is never 0 for GDT entries so we can tell this is an LDT selector without looking at the table selector bit. For servers and users, the CS selector is always 0x0007 and the DS selector is always 0x000F so very little can be decided from the selector alone.

The next step in debugging is to look at the process number:

>general protection from process number 1, pc = 0x0007:0x00000385

Assuming the kernel itself is working, this says that the faulty task is FS and something bad happens at address 0x385 in the code segment.
The next step is to disassemble around address 0x385 using db after stopping the boot at a convenient breakpoint, or using mdb on the fs binary on a working system. Either the instruction or the data it references could be wrong. In this case I think the trap is caused by a data reference beyond the end of the data segment. Build has caused the data segment to be too small.
-- 
Bruce Evans
evans@syd.dit.csiro.au