[comp.os.os9] Diffs between OS-9 2.2 / 2.4 ?

pss1@kepler.unh.edu (Paul S Secinaro) (06/27/91)

     Recently, we have been having trouble with a piece of software
which we are evaluating.  When I run the program on our system, which
is running OS-9/68020 v2.2, I get an error #000:102, which the manual
states is a "Bus exception error" or something like that (it does not
go into any more detail than that).  This same binary is known to run
properly on a system running OS-9 v2.4 on a 68030 module.

So, my questions boil down to:

1.  What exactly does error #000:102 mean?  I know that UNIX, for
example, will give a "Bus Error - core dumped" message if you try to
use an improperly initialized pointer or otherwise bang up against the
limits imposed by the MMU.  Does it mean the same thing in OS-9?  Or
is it referring to some sort of hard error on the processor bus?

2.  Could this type of error be caused by a lack of upward compatibility
from v2.2 => v2.4 of OS-9?

3.  Could this program be trying to execute '030 specific code on the
'020?  Can the Microware C compilers generate '030 specific code?

    I realize that it's impossible for anyone to fully answer any of
these questions without knowing a lot about the code itself.  I'm just
interested in knowing if 1,2, and 3 are possible sources of trouble.
Is there any way to set up DEBUG to go active on an error trap, or does
it do this automatically?  (I've never used it much before.)  Thanks

Paul
-- 
Paul Secinaro             | Synthetic Vision and Pattern Analysis Laboratory
pss1@kepler.unh.edu       | Department of Computer and Electrical Engineering
p_secinaro@unhh.unh.edu   | University of New Hampshire     (603) 862-3287

tony@mwuk.UUCP (Tony Mountifield) (06/28/91)

In article <1991Jun27.140054.13331@unhd.unh.edu> pss1@kepler.unh.edu (Paul S Secinaro) writes:
> 1.  What exactly does error #000:102 mean?  I know that UNIX, for
> example, will give a "Bus Error - core dumped" message if you try to
> use an improperly initialized pointer or otherwise bang up against the
> limits imposed by the MMU.  Does it mean the same thing in OS-9?  Or
> is it referring to some sort of hard error on the processor bus?

It means what you thought - an access through an uninitialized or
corrupt pointer. On systems with an MMU and the "ssm" module, this means
almost anything outside your own program and data space (or modules you
have linked to). Without an MMU, it will only happen with references to
non-existent physical addresses. If a user-state program gets a bus
error, the kernel catches it, and terminates the process cleanly with an
error 102.
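
As a rough illustration of guarding against that kind of bad pointer,
here is a minimal ANSI C sketch. The function names are hypothetical and
not taken from the program under discussion. It also assumes pointers
are initialized to NULL when they are declared, because a NULL test
cannot detect a pointer that holds random garbage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a name into a caller-supplied buffer.
 * Returns 0 on success, or -1 if a pointer is NULL or the buffer
 * has no room, instead of dereferencing a bad pointer. */
int copy_name(char *dst, size_t dstlen, const char *src)
{
    if (dst == NULL || src == NULL || dstlen == 0)
        return -1;
    strncpy(dst, src, dstlen - 1);
    dst[dstlen - 1] = '\0';   /* strncpy does not always terminate */
    return 0;
}

/* Small self-check showing both outcomes. */
void check_copy_name(void)
{
    char buf[8];
    char *unset = NULL;       /* initialized, so the guard can see it */

    assert(copy_name(buf, sizeof buf, "test") == 0);
    assert(strcmp(buf, "test") == 0);
    assert(copy_name(unset, 8, "test") == -1);
}
```

With a guard like this, a missing buffer shows up as an error return
that the caller can report, rather than as the process being killed.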

> 2.  Could this type of error be caused by a lack of upward compatibility
> from v2.2 => v2.4 of OS-9?

Quite likely. If a program is compiled with the compiler and libraries
from a later system, it will likely not run on an earlier system. The
other way round is OK though - I have binaries compiled for V2.1, and
they run fine on V2.2, V2.3 and V2.4.

> 3.  Could this program be trying to execute '030 specific code on the
> '020?  Can the Microware C compilers generate '030 specific code?

No. In fact the difference between an 020 and an 030 is only in the area
of the MMU. Our compiler never has reason to generate MMU instructions.

>     I realize that it's impossible for anyone to fully answer any of
> these questions without knowing a lot about the code itself.  I'm just
> interested in knowing if 1,2, and 3 are possible sources of trouble.
> Is there any way to set up DEBUG to go active on an error trap, or does
> it do this automatically?  (I've never used it much before.)  Thanks

If you run a program under DEBUG (OS-9 user-state debugger) it will be
informed by the kernel if the program gets a bus error, and you will be
able to examine the memory, registers, etc. To do this on a program
called "test", do:

	debug test
	x -1

This will execute at full speed (the "g" command uses Trace mode, and is
much slower), and will stop at an exception such as bus error.

Tony.
-- 
Tony Mountifield.                | Microware Systems (UK) Ltd.
MAIL:  tony@mwuk.uucp            | Leylands Farm, Nobs Crook,
INET:  tony%mwuk.uucp@ukc.ac.uk  | Colden Common, WINCHESTER, SO21 1TH.
UUCP:  ...!mcsun!ukc!mwuk!tony   | Tel: 0703 601990   Fax: 0703 601991
**** OS-9, OS-9000 Real Time Systems **** MS-DOS - just say "No!" ****

bcwhite@crocus.waterloo.edu () (06/28/91)

I don't know exactly why you would get a bus error on a 68020 but not
on an 030, unless you are trying to access things like the 030's
on-board MMU.  Other than those functional differences, the instruction
sets of the 68020-040 are all identical (MMU excepted).  Even the FP
unit codes are the same.

                                        Brian

---------------------------------------------------------------------
|  Internet   --  bcwhite@electrical.watstar.waterloo.edu (or .ca)  |
|  Delphi     --  BrianWhite                                        |
---------------------------------------------------------------------
"He who laughs last usually made a backup."

knudsen@cbnewsd.att.com (michael.j.knudsen) (06/29/91)

In article <1991Jun27.140054.13331@unhd.unh.edu>, pss1@kepler.unh.edu (Paul S Secinaro) writes:

>      Recently, we have been having trouble with a piece of software
> which we are evaluating.  When I run the program on our system, which
> is running OS-9/68020 v2.2, I get an error #000:102, which the manual
> states is a "Bus exception error" or something like that (it does not
> go into any more detail than that).  This same binary is known to run
> properly on a system running OS-9 v2.4 on a 68030 module.

Yes, OSK error messages are not known for detail, tho trivial
errors like #216 "File Not Found" give out tons of canned formal
verbiage :-)

> 1.  What exactly does error #000:102 mean?  I know that UNIX, for
> example, will give a "Bus Error - core dumped" message if you try to
> use an improperly initialized pointer or otherwise bang up against the
> limits imposed by the MMU.  Does it mean the same thing in OS-9?  Or

Yes it's the same idea -- in our case, Odd Address Error.
Having just ported a 6809 program to a 68000 clone, I think I can
answer you correctly.

> 3.  Could this program be trying to execute '030 specific code on the
> '020?  Can the Microware C compilers generate '030 specific code?

Your problem is that the code is trying to access a word (long or
short) on an odd byte address.  Since the 68000 chip is 16-bit
shortword hardware, this is a no-no and causes a hardware bus
interrupt and the error you got.

The 68030 (and 020) threw in extra silicon to shift data up or down
a byte as needed, so odd-byte accesses are now legal on those
advanced chips.  They are slower, tho, requiring an extra bus cycle.

So code that runs on an 020 or 030 may break on the 000 thru 012.
Note that the OSK version is not the problem.

I got into trouble by casting char pointers into short or int
pointers.  Since the char pointers can be odd addresses, half the
time that will get you killed!
So check the source for (int *) and (short *) and (long *) applied
to char pointers.  Then rewrite the code (a shame, it may be
inelegant when you get finished).

If you *don't* find the above in the source, then maybe it's a
genuine buggy pointer that only the old 68000 can catch.
Hope you find it...

In my own case, I was casting a char field inside a structure that way.
I re-ordered the fields within the structure to make the byte in
question land on an even boundary.  Fixed the problem.
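
The re-ordering can be sketched roughly as follows, in ANSI C. The
structure and field names are hypothetical, since the real structure is
not shown here, and the offset of 1 in the "before" layout is what
typical compilers produce rather than something the language promises.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layouts; the real structure is not shown. */

/* Before: a one-byte field pushes raw[] to an odd offset. */
struct rec_before {
    char flag;
    char raw[4];    /* offset 1 on typical compilers */
    long total;
};

/* After: raw[] is the first member, so it starts at offset 0. */
struct rec_after {
    char raw[4];
    char flag;
    long total;
};

/* Returns 1 if the member offset is even, 0 otherwise. */
int is_even_offset(size_t off)
{
    return (off % 2) == 0;
}

/* Small self-check comparing the two layouts. */
void check_layouts(void)
{
    assert(!is_even_offset(offsetof(struct rec_before, raw)));
    assert(is_even_offset(offsetof(struct rec_after, raw)));
}
```

This only addresses alignment. A check like check_layouts() at program
start would catch a later edit that moves the field back onto an odd
offset.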

But the moral is that some forms of pointer type-casting are NOT
portable, even within the Moto family.
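
One way to avoid the cast altogether is to build the value from
individual bytes, which works at any address. This is only a sketch in
ANSI C: the function names are hypothetical, and it assumes the data is
stored most significant byte first, as it is in memory on the 68000
family.

```c
#include <assert.h>

/* Hypothetical helpers, not taken from any real program. */

/* Read a 16-bit big-endian value one byte at a time.
 * Safe even when p is an odd address. */
unsigned int get_u16_be(const unsigned char *p)
{
    return ((unsigned int)p[0] << 8) | (unsigned int)p[1];
}

/* Read a 32-bit big-endian value one byte at a time. */
unsigned long get_u32_be(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) |
           ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |
            (unsigned long)p[3];
}

/* Small self-check: buf + 1 stands in for an odd-offset field. */
void check_unaligned_reads(void)
{
    static const unsigned char buf[] =
        { 0x00, 0x12, 0x34, 0x56, 0x78, 0x9A };

    assert(get_u16_be(buf + 1) == 0x1234);
    assert(get_u32_be(buf + 1) == 0x12345678UL);
}
```

Where the old code had something like *(short *)cp, the call would be
get_u16_be((const unsigned char *)cp). It costs a few more instructions
but does not depend on how the buffer happens to be aligned.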
-- 
"What America needs is A Thousand Points When Lit..."
	knudsen@ihlpl.att.com