[comp.unix.xenix] Xenix 386 v2.3 bug?

tyager@maxx.UUCP (Tom Yager) (11/10/88)

I just finished installing Xenix version 2.3, and (more or
less) immediately stumbled over two apparent bugs:

1. System panics trying to format a floppy

When trying to format a 96TPI 5.25" floppy in drive "A", the system takes a
general protection panic before it even accesses the drive. Reads (tar,
dosdir, etc.) seem to work okay.

Nuts!

2. return(value) from main() doesn't work the same as exit(value)

This simple program:

main()
{
   return(0);
}

passes a value other than 0 to its parent. Replacing 'return' with 'exit'
makes it work properly.

BTW, I found #2 while re-installing smail2.5. Incoming uucp mail was getting
delivered but the uucp/uuxqt log was showing "command failed with status 2"
messages. This resulted in a mail message getting shipped to the sender
stating that their mail didn't get delivered.
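
To see what a parent such as uuxqt observes, a small harness can run a command and report the status it collects. The following is only a minimal sketch in C: the name run_status is invented for illustration, and it assumes a modern POSIX system with fork, execl, waitpid and /bin/sh, which is not necessarily what Xenix 2.3 provided.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* run_status is a hypothetical helper: run cmd through /bin/sh and
 * return the exit status the parent collects, or -1 on failure. */
int run_status(const char *cmd)
{
    int status;
    pid_t pid;

    fflush(NULL);                 /* keep buffered output from being duplicated in the child */
    pid = fork();
    if (pid < 0)
        return -1;                /* fork failed */
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);               /* reached only if the exec itself failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* WEXITSTATUS is meaningful only when the child exited normally. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Pointing run_status at the compiled test program should give 0 on a system where return from main() behaves; a nonzero result would match what the uuxqt log reported.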

I installed this new version primarily to get HDB uucp, which I like a lot. I
was a little surprised at the format bug (that one would seem unlikely to get
through simple testing, eh?), but I'm not THAT upset--it is new, and the
problems may just be on my machine.

If anyone from SCO is listening, I'm running a Compaq '386, 2MB memory, 100MB
hard disk (70+30), 2 serial, 2 parallel, 1.2MB floppy, 40MB Irwin tape. I'll
be glad to make it crash again & copy the pertinent info down if I get an
email request.

Anyone else seeing these things?

Thanks, all.
(ty)

-- 
Tom Yager, Apollo Computer
ARPA: tyager%maxx@m2c.m2c.org (preferred) -or- tyager@apollo.com 
-- I speak only for (and to) myself --
"I like life; it's something to do."

debra@alice.UUCP (Paul De Bra) (11/11/88)

In article <40@maxx.UUCP> tyager@maxx.UUCP (Tom Yager) writes:
>...
>2. return(value) from main() doesn't work the same as exit(value)

According to the new K&R book, returning from main and calling exit should
be equivalent in ANSI C. However, I don't think this was always the case.

return is a C statement which causes main to return a value to its caller,
and this may prevent some necessary cleanup in older implementations.
exit is a library function which does the necessary cleanup and then
terminates the process.
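
One common way for an implementation to make the two equivalent is for the startup code that calls main to hand main's return value straight to exit. The following is a minimal model of that idea, not the Xenix runtime: model_startup, sample_main and status_through_startup are names invented here, and POSIX fork and waitpid are used only so the resulting status can be observed from outside.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical stand-in for a program's main(). */
int sample_main(void)
{
    return 3;
}

/* Hypothetical model of C startup code: whatever the entry function
 * returns is passed to exit(), so cleanup runs either way. */
void model_startup(int (*entry)(void))
{
    exit(entry());
}

/* Run model_startup() in a child process and return the status the
 * parent sees, or -1 on failure. */
int status_through_startup(int (*entry)(void))
{
    int status;
    pid_t pid;

    fflush(NULL);                 /* keep buffered output from being flushed twice */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        model_startup(entry);     /* does not return */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If startup code skipped the exit(entry()) step and discarded the return value, the parent would see whatever status the runtime happened to supply instead, which is one possible explanation (an assumption, not confirmed in this thread) for the behavior reported.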

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

richard@neabbs.UUCP (RICHARD RONTELTAP) (11/11/88)

[ return(x) doesn't work in main, exit(x) does ]
 
It is little consolation, but this is a documented 'bug'. I read it in
the release notes of development system 2.2.x, I think.
 
Richard
(...!mcvax!neabbs!richard)

jbayer@ispi.UUCP (id for use with uunet/usenet) (11/12/88)

In article <40@maxx.UUCP>, tyager@maxx.UUCP (Tom Yager) writes:
. I just finished installing version Xenix version 2.3 recently, and (more or
. less) immediately stumbled over two apparent bugs:
. 
. 1. System panics trying to format a floppy
. 
. When trying to format a 96TPI 5.25" floppy in drive "A", the system takes a
. general protection panic before it even accesses the drive. Reads (tar,
. dosdir, etc.) seem to work okay.
. 
. 
	(other stuff deleted)
. 
. I installed this new version primarily to get HDB uucp, which I like a lot. I
. was a little surprised at the format bug (that one would seem unlikely to get
. through simple testing, eh?), but I'm not THAT upset--it is new, and the
. problems may just be on my machine.
. 


According to the documentation, drive A is mapped to /dev/install, which is NOT
suitable for formatting.  They recommend using /dev/rfd096 or /dev/rfd048
to format.  (/dev/install cycles through different disk densities looking
for a match)

Jonathan Bayer
Intelligent Software Products, Inc.

egon@impch.UUCP (Lukas Knobloch) (12/14/88)

Hiya

From time to time we have problems with our system. It reports:

panic:Memory failure - parity error

Hit any......

Does anybody know how to prevent this? :-(
I think it's the EGA, but I'm a bit unsure.

please reply


sincerely :-)

lukas

debra@alice.UUCP (Paul De Bra) (12/17/88)

In article <401@impch.UUCP> egon@impch.UUCP (Lukas Knobloch) writes:
>
>Hiya
>
>From time to time we've got problems with our System it means:
>
>panic:Memory failure - parity error
>
>Hit any......
>
>does anybody know a prevention against this???????? :-(
>i'll think it's the EGA but i'm a bit unsure

You don't give much info on your system, but let me try to give some
general advice, cause I've had to tackle this problem several times before.

The most likely cause is that your memory chips are not quite fast enough.
If you have a "cheap" clone the memory is usually barely fast enough and
sometimes fails. It often is a matter of just that one nanosecond too slow
to perform reliably. (It need not be just the memory chips, could be the
supporting hardware as well.)
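
For context on what the panic is reporting: parity-checked memory typically stores one extra bit per byte and flags an error when a byte read back disagrees with that bit, which is how a marginal chip that occasionally returns a wrong bit gets noticed. A minimal sketch of the check in C follows. The function names are invented, and even parity is assumed purely for illustration, since the thread does not say which convention the hardware uses.

```c
/* Return 1 if byte has an odd number of 1 bits, else 0.
 * Stored next to the byte, this bit makes the 9-bit total even
 * (even parity is an assumption made for this sketch). */
unsigned parity_bit(unsigned char byte)
{
    unsigned p = 0;

    while (byte) {
        p ^= byte & 1u;           /* toggle for every 1 bit seen */
        byte >>= 1;
    }
    return p;
}

/* Return 1 if the byte read back no longer matches its stored parity bit. */
int parity_error(unsigned char read_back, unsigned stored_parity)
{
    return parity_bit(read_back) != stored_parity;
}
```

A single flipped bit changes the computed parity and is caught, while two flipped bits in the same byte cancel out and pass unnoticed, so a parity panic indicates a problem but a clean run does not prove the memory is sound.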

Another possibility is that some peripheral board is not up to your bus speed.

I have been able to solve several of these problems by changing the positions
of the boards. On a generic AT the rightmost slots (towards the power supply)
give you a bit more flexibility than the leftmost slots. Usually a display
adapter is put in the leftmost slot. Moving it as far to the right as possible
can help. Also, if you have memory boards (you may not have any on the 16-bit
bus since you mention 80386) the EGA-card may not like to be far away from the
memory boards. I found one instance where a system with 2 memory boards would
only function if the EGA-card was installed right between the 2 memory boards.
(Putting the boards in adjacent slots and the EGA-card next to them didn't
help. The EGA had to sit between the memory cards.)

You may think all of this is nonsense. I can assure you it is not.
I recently installed a second memory board in an 8 MHz AT (with 1 wait state).
Both memory boards had 120ns chips, so plenty fast. Well, the startup memory
test would not even pass unless I moved the memory boards as far to the
right as possible, and the EGA next to them.

Hope this helps. Start shuffling boards!

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

tuck@iris.ucdavis.edu (Devon Tuck) (12/19/88)

Regarding the memory parity error,  we had the same problem.  None of the
standard tests we ran gave any indication of problems.  I talked to the
techs at SCO and they said it didn't matter what our tests said because
Xenix uses the memory boards differently.  Finally we just replaced our
board with a new one and it worked fine.  I am not sure it was possible
for us to swap slots because it was the extended memory which was messing
up, and all of the other software counted on the board being in a certain
slot.  Anyway, everything works fine now, but I hope you don't have to
replace your board!

Devon Tuck

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (12/20/88)

In article <401@impch.UUCP> egon@impch.UUCP (Lukas Knobloch) writes:
| 
| Hiya
| 
| From time to time we've got problems with our System it means:
| 
| panic:Memory failure - parity error

  I may be able to help. I had a similar problem, and it only occurred
when I was doing floppy disk i/o, therefore no memory test could (or
would) find it.

  After some extensive detective work it turned out the memory failed
only when the CPU and DMA channel were both hitting the same memory. I
started swapping memory, to no avail. If I took out any 1/2MB it went
away, but replacing the memory brought it back.

  I finally found that I had a slow support chip on the memory board,
and when I put more than 1.5MB on it, it failed. Driving the extra load
of more chips, the card just couldn't keep up. At 2.5 it was really
unreliable. After trying to find the chip, I decided to just put the
extra memory on another board and forget it. So far everything is
running just fine, although I will probably go to faster chips when the
prices come down.

  If the problem is not associated with floppy i/o you have some other
problem, and the memory could be at fault as well as the support chips,
but if the memory test runs and the o/s doesn't, it's usually a slow
chip, either memory or support. Hope this helps you (or somebody), the
manufacturer was ready to swear that the problem was in Xenix.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

debra@alice.UUCP (Paul De Bra) (12/20/88)

In article <12837@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
|In article <401@impch.UUCP> egon@impch.UUCP (Lukas Knobloch) writes:
|| 
|| Hiya
|| 
|| From time to time we've got problems with our System it means:
|| 
|| panic:Memory failure - parity error
|
|...
|  I finally found that I had a slow support chip on the memory board,
|and when I put more than 1.5MB on it, it failed. Driving the extra load
|of more chips, the card just couldn't keep up. At 2.5 it was really
|unreliable...

I have seen the same problem: I have 2 2.5MB memory boards in my AT, same
brand and model, but one newer than the other. If I try to beef up the CPU
to 12 MHz the newer board still works and the other isn't even found by the
startup memory test. Memory chips have the same speed, but the newer board
has faster support chips. The board layout is still identical and so are the
numbers on the chips, but they are of a different type on the newer board,
indicated by the letter. (like 74F... or 74HC... or 74LS...)

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

fyl@ssc.UUCP (Phil Hughes) (12/21/88)

In article <3393@ucdavis.ucdavis.edu>, tuck@iris.ucdavis.edu (Devon Tuck) writes:
> Regarding the memory parity error,  we had the same problem.  None of the
> standard tests we ran gave any indication of problems. 

I have had this happen on all sorts of flavors of UNIX on AT bus
machines.  Boards would work with DOS but fail with XENIX or V/AT.
After watching this happen I came to the conclusion that the failures
only happened while disk I/O was in progress.  Although I haven't gone
to the logic analyzer level, this seems to fit as memory diagnostics
only test memory and disk diagnostics only test disks.  Therefore,
the failure would not show up with the tests.

I considered writing a real test that would cause the failure but
it is easier to buy memory boards that actually work right and let
the DOS users have the others.
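
The shape of such a combined test can be sketched as follows: fill a buffer with a pattern, push it through a file, read it back, and count bytes that changed. This is only an illustration in standard C, and pattern_test_with_io is an invented name. On a modern system the data will likely be served from cache rather than the disk, so it does not reproduce CPU and DMA contention on an AT; it only shows memory checks interleaved with I/O rather than run separately.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: write a pattern buffer to a temporary file,
 * read it back, and count mismatching bytes over several passes.
 * Returns the mismatch count, or -1 if setup or I/O fails. */
long pattern_test_with_io(size_t bytes, int passes)
{
    unsigned char *out = malloc(bytes);
    unsigned char *in = malloc(bytes);
    FILE *fp = tmpfile();
    long bad = 0;
    int pass;
    size_t i;

    if (out == NULL || in == NULL || fp == NULL) {
        free(out);
        free(in);
        if (fp != NULL)
            fclose(fp);
        return -1;
    }
    for (pass = 0; pass < passes; pass++) {
        /* Alternate 0x55 and 0xAA so every bit flips between passes. */
        unsigned char pattern = (pass & 1) ? 0xAA : 0x55;

        memset(out, pattern, bytes);
        memset(in, 0, bytes);
        rewind(fp);
        if (fwrite(out, 1, bytes, fp) != bytes || fflush(fp) != 0) {
            bad = -1;             /* write failed */
            break;
        }
        rewind(fp);
        if (fread(in, 1, bytes, fp) != bytes) {
            bad = -1;             /* read failed */
            break;
        }
        for (i = 0; i < bytes; i++)
            if (in[i] != pattern)
                bad++;            /* byte came back different */
    }
    fclose(fp);
    free(out);
    free(in);
    return bad;
}
```

A real test of the failure described here would need to bypass caching and keep the disk or floppy controller busy while the CPU touches the same memory, which is beyond what portable C can promise.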

-- 
Phil Hughes, SSC, Inc. P.O. Box 55549, Seattle, WA 98155  (206)FOR-UNIX
    uw-beaver!tikal!ssc!fyl or uunet!pilchuck!ssc!fyl or attmail!ssc!fyl

det@hawkmoon.MN.ORG (Derek E. Terveer) (01/05/89)

In article <3393@ucdavis.ucdavis.edu>, tuck@iris.ucdavis.edu (Devon Tuck) writes:
> Regarding the memory parity error,  we had the same problem.  None of the
> standard tests we ran gave any indication of problems.  [..]

When i installed my intel above board 286/ps into my 386 machine, i had a bad
memory chip on it (unknown to me at the time, of course (:-(..).  I ran the
intel diagnostics (testab.exe) on the silly thing for >100 times w/o a single
error being detected.  

I later talked to an intel tech who admitted that the intel diagnostics were
(i'm paraphrasing slightly here:) "rather lame."  He admitted that the
diagnostics did not really test extended memory!!  I guess 'cause dos doesn't
use it very well (the extended memory, that is) that memory wasn't really
required to be tested.

I also talked to my wholesaler's tech, who stated that he had seen (memory) 
chips completely removed from the board and the diagnostic still ran w/o one
error!!!!!

That's amazing.

derek
-- 
Derek Terveer 	    det@hawkmoon.MN.ORG || ..!uunet!rosevax!elric!hawkmoon!det
		    w(612)681-6986   h(612)688-0667

"A proper king is crowned" -- Thomas B. Costain