[comp.os.minix] Interrupt problems

pcm@iwarpo3.intel.com (Phil C. Miller) (02/13/88)

In the seemingly never-ending process of trying to whip my AT-clone
into shape long enough to rebuild the Minix kernel, I have run into a
fairly troublesome problem.

When I attempt a 'make', Minix crunches away for 10-15 seconds, then
starts giving me an endless stream of unexpected interrupt messages.

Occasionally, these problems go away long enough to get something done,
but there seems to be no reliable way to get rid of them.  As a
stop-gap measure, I can power down my PC, rip out the extended memory
cards and my bus mouse, and try again.  If any of those are causing the
problem, the interrupt errors will at least go away long enough to
finish building the v1.2 kernel.

However, as soon as I plug my toys back in, the problem will come back
(again assuming it's a hardware problem induced by the toys).

Is there some general mechanism for dealing with such problems?  My
suggestion would be a program which printed warning messages for the
first one or two interrupts, then "ignored" subsequent messages.

By "ignoring" I basically mean anything which suppresses hundreds of error
messages flying by at 600kbaud.  Perhaps a program which (a) disabled
the offending interrupts after a few token warnings, or (b) tacitly
handled the required Minix interrupt protocol and returned control to
the interrupted process without action (the degradation in performance
would be much better than no performance at all).
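
A rough sketch of option (a) in C, for illustration only.  The names
unexpected_irq, WARN_LIMIT, and irqs_to_mask are made up here and are not
taken from the Minix source, and the sketch only records which lines
should be masked; a real kernel would also have to write that mask to the
8259s.

```c
#include <stdio.h>

#define NR_IRQ      16   /* two cascaded 8259s on an AT give 16 lines */
#define WARN_LIMIT   2   /* warnings printed per IRQ before going quiet */

static unsigned spurious_count[NR_IRQ];  /* unexpected interrupts seen per IRQ */
static unsigned short masked_irqs;       /* bit n set = IRQ n should be masked */

/* Hypothetical hook, called when an interrupt arrives for which no driver
 * is installed.  Returns 1 if a warning was printed, 0 if the interrupt
 * was swallowed silently. */
int unexpected_irq(int irq)
{
    if (irq < 0 || irq >= NR_IRQ)
        return 0;                        /* not a valid line, ignore it */

    spurious_count[irq]++;
    if (spurious_count[irq] <= WARN_LIMIT) {
        printf("unexpected interrupt on IRQ %d\n", irq);
        return 1;
    }

    /* Past the limit: note that this line should be masked from now on. */
    masked_irqs |= (unsigned short)(1u << irq);
    return 0;
}

/* Bitmap of the lines that have exceeded the warning limit. */
unsigned short irqs_to_mask(void)
{
    return masked_irqs;
}
```

With WARN_LIMIT set to 2, the first two calls for a given IRQ print a
warning and the third and later calls are silent and flag the line for
masking.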

Any suggestions, oh netlanders?  I will post interesting results of a 
non-pornographic nature.

Incidentally, I am running on an Intelligent Data Systems PC-286 Turbo
AT with a Western Digital controller, a CDC 42 MB/28 ms Winchester, and
it is not inherently friendly toward Minix.  Have had hard disk problems
of various kinds, Hercules display problems, and an occasional floppy
problem.  My printer, a Panasonic 1080i, also gives me fits with Minix.
It is obviously imperative that I get to version 1.2; can't do much with
v1.1.

This brings up another interesting point: is it possible (legal) for 
someone to e-mail, snail-mail, or post a copy of the kernel?  I really
need a copy of v1.2 and I'm just not getting anywhere after 8 months of
grief with my PC choking on EVERYTHING Minix tries to do.

Thanks for WHATEVER help I get.

Phil Miller

Leisner.Henr@xerox.com (marty) (02/14/88)

Phil,

Minix boots with all your interrupts unmasked (on your 8259 PICs).

It is entirely probable your "junk" is asserting interrupts after boot.

I recall the same problems when I started playing with Minix on genuwine IBM-PC
ATs.  I think at the time I plugged my hardware interrupts with a dummy
interrupt handler.  

Work out which devices you have drivers for and enable only those interrupts.
The problem oughta go away.
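
A minimal sketch of the masking idea in C, assuming the standard AT layout
of two cascaded 8259s whose mask registers sit at ports 0x21 and 0xA1, with
a 1 bit disabling a line.  The struct and the function masks_for are
hypothetical names, not Minix code, and the sketch only computes the mask
bytes; writing them to the ports is left to the kernel.

```c
/* Interrupt mask bytes for the two 8259s: a 1 bit disables that line. */
struct pic_masks {
    unsigned char master;   /* IRQ 0-7,  would be written to port 0x21 */
    unsigned char slave;    /* IRQ 8-15, would be written to port 0xA1 */
};

#define CASCADE_IRQ 2       /* the slave 8259 hangs off IRQ 2 of the master */

/* Build mask bytes that leave enabled only the IRQs listed in irqs[]. */
struct pic_masks masks_for(const int *irqs, int n)
{
    struct pic_masks m = { 0xFF, 0xFF };   /* start with everything masked */
    int i;

    for (i = 0; i < n; i++) {
        int irq = irqs[i];
        if (irq >= 0 && irq < 8) {
            m.master &= (unsigned char)~(1u << irq);
        } else if (irq >= 8 && irq < 16) {
            m.slave &= (unsigned char)~(1u << (irq - 8));
            /* A slave line is only seen if the cascade line is open too. */
            m.master &= (unsigned char)~(1u << CASCADE_IRQ);
        }
        /* Anything outside 0-15 is ignored. */
    }
    return m;
}
```

For example, enabling only the clock (IRQ 0), keyboard (IRQ 1), floppy
(IRQ 6), and AT hard disk (IRQ 14) gives a master mask of 0xB8 and a slave
mask of 0xBF; a bus mouse or memory card on any other line stays masked.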


marty
ARPA:	leisner.henr@xerox.com
GV:  leisner.henr
NS:  martin leisner:henr801c:xerox

pcm@iwarp.intel.com (Phil C. Miller) (02/18/88)

Thanks for the input, Marty.  I will check it out.  For the moment, I will
be rebuilding Minix sans memory cards and Mouse (none of which would do me
any good under Minix anyway).  After I get Minix rebuilt, I will try to 
mask the aforementioned !@#$%^&*() interrupts.

Phil Miller

nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (04/24/89)

My copy of Minix 1.3 just arrived from P-H and I am trying to get it 
working.  Most everything seems to run fine, except the hard disk
driver.  Snooping around indicates that the 5100 has a disk controller
that's too clever by half.  In particular, I have reason to believe
that the controller caches an entire track when asked for a sector.

When setting up the WINCHESTER task at boot time, Minix reads things
off the disk (in particular, the first thing read is the partition table).
Because the disk has 512-byte sectors, and Minix uses 1 KB blocks, the
driver asks for two sectors from the beginning of the disk.  Then the
following occurs, in quick succession:

	- an interrupt is received from the controller
	- the routine "interrupt" starts running; the first thing it
	  does is reenable the 8259 PIC
	- the controller interrupts again(!)
	- interrupt processing starts all over again, saving the
	  second message and setting the appropriate bit in the
	  bitmap saying a message is pending for WINCHESTER
	- interrupt processing resumes on the first interrupt,
	  saving the first message over the second and setting
	  the WINCHESTER bit again
	- WINCHESTER finally does a RECEIVE(HARDWARE,..)
	- "interrupt" sends WINCHESTER a message successfully and
	  clears its message pending bit
	- WINCHESTER looks for the second sector by doing another
	  RECEIVE(HARDWARE,..) and goes off to never never land
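
The sequence above can be reproduced with a toy model in C.  This is not
the proc.c code, just an illustration under the assumption that each task
has one saved message and one pending bit; the names one_slot,
slot_interrupt, and slot_receive are invented for the sketch.  The counted
variant at the end is one possible alternative, not something taken from
Minix.

```c
/* One-slot scheme: a pending bit plus a single saved message per task. */
struct one_slot {
    int pending;    /* message-pending bit */
    int msg;        /* the one saved message */
};

/* Interrupt side: save the message and mark it pending. */
void slot_interrupt(struct one_slot *s, int msg)
{
    s->msg = msg;   /* overwrites whatever was saved before */
    s->pending = 1; /* setting an already-set bit loses the count */
}

/* Task side: returns 1 and stores the message if one is pending,
 * 0 if the task would block waiting for the hardware. */
int slot_receive(struct one_slot *s, int *msg)
{
    if (!s->pending)
        return 0;
    *msg = s->msg;
    s->pending = 0;
    return 1;
}

/* Alternative: count pending interrupts instead of keeping one bit. */
struct counted {
    unsigned pending;
};

void counted_interrupt(struct counted *c)
{
    c->pending++;
}

int counted_receive(struct counted *c)
{
    if (c->pending == 0)
        return 0;
    c->pending--;
    return 1;
}
```

Two calls to slot_interrupt followed by two calls to slot_receive deliver
only one message, and the second receive reports that the task would
block, which matches the hang described.  With the counter, both receives
succeed.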

I have fixed this by changing w_transfer in WINCHESTER to do a series
of single-sector reads to fill out a block, which works.  Performance
seems somewhat awful, however.
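
In outline, the workaround amounts to a loop like the one below.  This is
a sketch in C, not the actual w_transfer change: read_block, the
sector_reader callback, and the fake disk are all hypothetical names used
to keep the example self-contained.

```c
#include <string.h>

#define SECTOR_SIZE        512
#define BLOCK_SIZE         1024
#define SECTORS_PER_BLOCK  (BLOCK_SIZE / SECTOR_SIZE)

/* Hypothetical single-sector read: returns 0 on success. */
typedef int (*sector_reader)(unsigned long sector, unsigned char *buf);

/* Fill one block by issuing one request per sector, so the driver never
 * has more than one controller interrupt outstanding at a time. */
int read_block(sector_reader rd, unsigned long block, unsigned char *buf)
{
    unsigned long first = block * SECTORS_PER_BLOCK;
    int i;

    for (i = 0; i < SECTORS_PER_BLOCK; i++) {
        int r = rd(first + i, buf + i * SECTOR_SIZE);
        if (r != 0)
            return r;            /* give up on the first failed sector */
    }
    return 0;
}

/* Fake disk for trying the loop out: every byte of a sector holds the
 * low 8 bits of that sector's number. */
static int fake_reads;           /* number of single-sector requests made */

int fake_read_sector(unsigned long sector, unsigned char *buf)
{
    memset(buf, (int)(sector & 0xFF), SECTOR_SIZE);
    fake_reads++;
    return 0;
}
```

Each sector becomes its own request with its own interrupt, which may be
part of why throughput drops compared with a two-sector transfer.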

Now the question:  Should I be patching WINCHESTER for this?  Arguments
in favor include localizing the changes to one routine, but I have a
fundamental objection to the fact that the interrupt handler can lose
interrupts in this way.  I will admit that my machine may be unusual, but
it seems that similar problems would occur in a machine with two or more
hard disks, since one could easily have multiple requests pending on 
multiple disks.  My gut feeling is that proc.c should be rewritten to
handle this.  As this is somewhat philosophical, perhaps ast could comment.

In addition, I would be happy to receive suggestions on improving the
hard disk performance I am seeing.  Does 1.4 address either the correctness
or performance issues?

Norbert