[comp.os.minix] Problems compiling kernel; problems with build

wjc@eddie.MIT.EDU (Bill Chiarchiaro) (05/04/87)

Please excuse the length of this posting, but I wanted to provide
as thorough a description of my problem as possible.

I am having problems compiling the MINIX kernel and building good
boot diskettes.  My machine is a Leading Edge MP-1673 with 640K
and two 320KB floppy drives.  I also have the Leading Edge
Monochrome Display/Parallel printer adapter, a Standard Brands
Multifunction card, and a Standard Brands FlashCard 30 (MiniScribe
8425 Winchester drive with Western Digital WX1002-27X RLL controller).
The 8088's clock rate is switchable to either 4.77 or 7.16 MHz.

When I try to compile the kernel and build a boot diskette, I get
one of three results:
		1) A working boot diskette.
		2) A fatal error during the "make" of the kernel.
		3) No errors from make, cc, or build, but a
			boot diskette that doesn't boot.
Out of the first 8 times I tried to make a boot diskette, only 1 was
successful.  I then realized that I wasn't sure which CPU speed I
had been using; I had been changing it for other purposes.  I dropped
the speed to 4.77 MHz and my next 3 attempts were all successful.
However, my last 4 attempts have all failed.

Of the 4 successful attempts, 2 started with a make from scratch of the
sources in the kernel directory, and the other 2 consisted of compiling
only one source file (printer.c or wini.c) and asld'ing with the
other previously-compiled files.  The builds have always used the existing
object files in the tools directory, except for my new kernel object files.

The error mentioned in item 2 above is of the form:
	make
		.
		.
		cc -S -Di8088 -w -F -T. floppy.c
		Unrecoverable disk error on device 2/0, block 330
		/usr/lib/opt: error on line 825(%*s): unknown instruction byte
	Make Error code 1

	***Stop.

Note that the error does not necessarily occur during the compilation
of floppy.c or during the opt phase of the compiler; these are just
examples.  Also, the error can occur on device 2/1 and at different block
numbers.  I also tried switching the diskettes between the two
drives and have still seen the error.  After getting the error, I can
cat every file on the kernel diskette without any problems.

The problem mentioned in item 3 goes as follows:  I go through a make (or
single cc and asld) of the kernel, move the new kernel to the tools
diskette, and do a build.  No errors have occurred up to this point.
However, when I try to boot the new boot diskette, I get the following
message:
		Booting MINIX 1.1
		Read Error.  Automatic reboot.
The machine then sits in a loop repeating the message.

I have had no problems with the machine under MS-DOS 2.11 or 3.10.  Under
MINIX, the only other problems I have had have been the expected ones
with the hard disk and parallel printer, and also that the shell sometimes
does not return a prompt after completing a command (hitting the interrupt
character wakes it up).

Can anyone give me any advice?

Thanks,
Bill
N1CPK
wjc@eddie.mit.edu

ast@cs.vu.nl (Andy Tanenbaum) (05/05/87)

In article <5692@eddie.MIT.EDU> wjc@eddie.MIT.EDU (Bill Chiarchiaro) writes:
>
> [Lots of problems described] Can anyone give me any advice?
>
A couple of things come to mind.  First, you said you were using 320K
diskettes.  If you really were, that would explain some things.  MINIX
expects 360K diskettes.    Check that carefully.
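
For reference, the 320K/360K difference comes down to sectors per
track.  The sketch below works through the arithmetic, assuming the
standard 5.25" double-density geometry (40 tracks, 2 sides, 512-byte
sectors, 8 or 9 sectors per track).  The function name floppy_bytes
is made up for illustration and is not part of MINIX.

```c
#include <stdio.h>

/* Capacity in bytes of a floppy with the given geometry.
 * Assumes 512-byte sectors; returns long so the result does not
 * overflow a 16-bit int. */
long floppy_bytes(int tracks, int sides, int sectors_per_track)
{
    return (long)tracks * sides * sectors_per_track * 512L;
}

/* Print both common double-density formats in Kbytes. */
void print_floppy_sizes(void)
{
    printf("8 sectors/track: %ldK\n", floppy_bytes(40, 2, 8) / 1024);
    printf("9 sectors/track: %ldK\n", floppy_bytes(40, 2, 9) / 1024);
}
```

A diskette formatted with 8 sectors per track has no sector 9, so any
read that assumes 9 sectors per track would fail on it.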

The error that intrigues me the most is the automatic reboot loop when
trying to boot a newly built diskette.  That loop is generated by the
bootstrap program (tools/bootblok.s).  The bootstrap reads in the
operating system using the BIOS; when it gets an error, it prints that
message and tries again, forever.  Thus the BIOS is returning errors
when reading the operating system.  My guess is that either you
really do have a 320K diskette or you have a 360K diskette with bad spots
on it.  In any event, I would take a new diskette, format it for 360K
and try that one.  If the problems go away, it is clearly due to a rotten
diskette.

The next thing that I suspect is a controller that has different timing
than the IBM's.  The greater success at 4.77 MHz is an indication.  You
probably need to handshake the controller.  A fix doing just that
was posted a few weeks ago.  If you are getting floppy disk errors or
hard disk errors, that could explain everything.

Another thing worth doing is writing a little test program in C that
opens a file, seeks to a random block, writes a known pattern (e.g.
512 times the block number), and then repeats this 100 times to purge
the buffer cache.  Then read them all back and verify all the bytes.
Run this program for an hour simultaneously on the hard disk and the
floppy.  If it reports errors, the problem is that the driver and
controller are not matched up.  Incidentally, if anyone has such a disk test
program, please post it.
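
A minimal sketch of such a test might look like the following.  It is
an illustration only, not a program from this thread.  Assumptions:
the names disk_test and fill_block, the default file name, and the
exact byte pattern (block number plus byte offset, truncated to 8
bits) are all made up here; it is written in ANSI C, so the function
headers would need converting to K&R style for the MINIX 1.1
compiler; and 100 random writes only stand a chance of pushing blocks
out of a small buffer cache, so more may be needed elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Fill buf with a pattern derived from the block number, so that
 * each block's expected contents can be recomputed at verify time. */
static void fill_block(unsigned char *buf, long block)
{
    int i;

    assert(buf != NULL);
    for (i = 0; i < BLOCK_SIZE; i++)
        buf[i] = (unsigned char)((block + i) & 0xFF);
}

/* Write nwrites randomly chosen blocks (out of nblocks) to path,
 * then read every written block back and compare all bytes.
 * Returns the number of bad blocks, or -1 on setup/write failure. */
long disk_test(const char *path, long nblocks, int nwrites, unsigned seed)
{
    unsigned char buf[BLOCK_SIZE], expect[BLOCK_SIZE];
    unsigned char *written;
    FILE *fp;
    long block, bad = 0;
    int i;

    if (nblocks <= 0)
        return -1;
    written = calloc((size_t)nblocks, 1);   /* which blocks we wrote */
    if (written == NULL)
        return -1;
    fp = fopen(path, "w+b");
    if (fp == NULL) {
        free(written);
        return -1;
    }

    srand(seed);
    for (i = 0; i < nwrites; i++) {
        block = rand() % nblocks;
        fill_block(buf, block);
        if (fseek(fp, block * BLOCK_SIZE, SEEK_SET) != 0 ||
            fwrite(buf, 1, BLOCK_SIZE, fp) != BLOCK_SIZE) {
            fclose(fp);
            free(written);
            return -1;
        }
        written[block] = 1;
    }
    fflush(fp);

    /* Verify pass: a failed seek, short read, or mismatch is bad. */
    for (block = 0; block < nblocks; block++) {
        if (!written[block])
            continue;
        fill_block(expect, block);
        if (fseek(fp, block * BLOCK_SIZE, SEEK_SET) != 0 ||
            fread(buf, 1, BLOCK_SIZE, fp) != BLOCK_SIZE ||
            memcmp(buf, expect, BLOCK_SIZE) != 0)
            bad++;
    }

    fclose(fp);
    free(written);
    return bad;
}

#ifdef DISK_TEST_MAIN
int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "disktest.tmp";
    long bad = disk_test(path, 200L, 100, 1u);

    if (bad < 0)
        printf("could not run test on %s\n", path);
    else
        printf("%ld bad block(s) on %s\n", bad, path);
    return bad == 0 ? 0 : 1;
}
#endif
```

Compiling with DISK_TEST_MAIN defined gives a standalone program that
takes the test file name as its argument.  Pointing one copy at a file
on the floppy and another at a file on the hard disk, and rerunning
them in a loop for an hour, would approximate the procedure described
above.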

Andy Tanenbaum (ast@cs.vu.nl)

wjc@mit-eddie.UUCP (05/08/87)

Sorry for the mistake in my original posting.  I have 360 KB (not 320 KB)
diskette drives.  In fact, the controller uses the NEC 765 which MINIX
expects.

I have tried the compilation / build about 15 times now and have been using
a number of brand-new (except for the MS-DOS formatting) 3M DS/DD diskettes.
Maybe my diskette drive heads need to be cleaned, but I haven't seen this
problem occur under MS-DOS or anywhere else under MINIX.  Is it that cc and
build do such an immense amount of disk I/O that the problem gets a greater
opportunity to manifest itself?

It's going to be a real trick to get the hard disk running when I can't
reliably rebuild the kernel off of floppies.

Thanks again,
Bill