[comp.lang.c] Borland Turbo C 2.0 for Atari 68000 machines: ODD behavior

carter@cs.wisc.edu (Gregory Carter) (04/06/91)

I recently had a problem with a compiler for my MEGA STE 4/50.  I was
wondering if any of you can report the same problems with Borland's
compiler?

When I attempt to move 0x03 into the address 0x00ff8e20 I get the
following:

 (unsigned int)(*((unsigned int *)0x00ff8e20L)) = 0x03;
 which translates to:
                       MOVE.W #$0003, $00ff8e20

 Ok, I expect that, that's no problem....BUT when I do this:

 (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
 which translates to:
                       MOVE.W #$0003, $8e20

This is obviously not correct.

I thought about the compiler cutting 8 bits off the top half of the
address since the 68000 doesn't use this space anyway.  But, this isn't
what I expected.  Anyone know why they (BORLAND) decided to do address
coloring in this fashion?

It's a curiosity right now, but when I was going through my prof's unclear,
nonportable, nonfunctional class examples, IT TICKED ME OFF. Made life
generally !pleasant.

Any help greatly appreciated.

Greg Carter
[Looks like a bug to me. -John]
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

ckp@grebyn.com (Checkpoint Technologies) (04/08/91)

In article <1991Apr6.091013.26131@daffy.cs.wisc.edu> carter@cs.wisc.edu (Gregory Carter) writes:
> (unsigned int)(*((unsigned int *)0x00ff8e20L)) = 0x03;
>                       MOVE.W #$0003, $00ff8e20
>...
> (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>                       MOVE.W #$0003, $8e20
>
>This is obviously not correct.

This may not be obvious, but it *is* correct.  The compiler is doing you a
favor.

The 68K has "absolute short" and "absolute long" addressing.  Absolute
long means that a whole 32 bit absolute address follows.

Absolute short means that only a 16 bit word follows, and it should be
*sign extended* to 32 bits before being used.

The value $8E20, when sign extended into a full 32 bit address becomes
$FFFF8E20.  This is just what you asked for, and the compiler found a
briefer way to code it.
-- 
ckp@grebyn.com
[brychcy@informatik.tu-muenchen.dbp.de (Till Brychcy) also pointed this out.
-John]

albaugh@dms.UUCP (Mike Albaugh) (04/09/91)

	I was originally going to say the same thing, but stopped when I
actually read the question:

From article <1991Apr8.133307.27870@grebyn.com>, by ckp@grebyn.com (Checkpoint Technologies):
> In article <1991Apr6.091013.26131@daffy.cs.wisc.edu> carter@cs.wisc.edu (Gregory Carter) writes:
>> (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>>                       MOVE.W #$0003, $8e20
>>
>>This is obviously not correct.
> 
> This may not be obvious, but it *is* correct.  The compiler is doing you a
> favor.
> [...]
> The value $8E20, when sign extended into a full 32 bit address becomes
> $FFFF8E20.  This is just what you asked for, and the compiler found a
> briefer way to code it.

	Such favors I don't need. Assuming that Greg correctly transcribed
the assembly output (rather than making a mistake disassembling actual
machine code) the "correct" outcome from the above requires an assembler
that has some non-intuitive way of marking "I want a 32-bit address here"
and _defaults_ to using short addresses. This is definitely not the
way any of the five 68000 assemblers I have used work, and seems to me to
violate the "principle of least surprise". Now, if the assembler had
taken
	MOVE.W #$0003, $ffff8e20

and generated a 16-bit address (as all the ones I've used do) I would
accept it as a "favor". But if I have to code

	MOVE.W	#$0003, $8e20.YES_I_REALLY_MEAN_IT

to access the equally legitimate address $00008e20, I don't consider that
a "favor". My personal suspicion is that either Greg (or one of his
software tools) mis-disassembled some machine code, or the compiler
was intended to emit a variety of 68000 assembly syntaxes, and the
marker for "use short address" was accidentally omitted in his "flavor".
The second scenario also postulates an intended assembler that is _not_
smart enough to figure it out for itself, so I am dubious...

					Mike

Mike Albaugh (albaugh@dms.UUCP || {...decwrl!pyramid!}weitek!dms!albaugh)
[I've sent followups to comp.sys.m68k, since this seems to be more related
to details of 68K assembler syntax than any compiler issue. -John]

pardo@june.cs.washington.edu (David Keppel) (04/13/91)

carter@cs.wisc.edu (Gregory Carter) writes:
> (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>translates to:
>                       MOVE.W #$0003, $8e20
>This is obviously not correct.

I'll agree that it isn't what I *expected* but it is *correct*.
Here are some other correct implementations:

 * Compiler refuses to compile the program
 * Program aborts when executed
 * Program runs `rogue'
 * Program assigns 3 to memory location 0xffff8e20

Remember, dereferencing a hard-coded address (in C) has
implementation-defined effect.

In the meanwhile, I agree that the one it chose is non-intuitive.

Followups to `comp.lang.c' and `comp.sys.m68k'.

	;-D on  ( Paid-up in tuition )  Pardo

boyne@hplvec.LVLD.HP.COM (Art Boyne) (04/15/91)

>In comp.lang.c, pardo@june.cs.washington.edu (David Keppel) writes:
>
>    carter@cs.wisc.edu (Gregory Carter) writes:
>    > (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>    >translates to:
>    >                       MOVE.W #$0003, $8e20
>    >This is obviously not correct.
>
>    I'll agree that it isn't what I *expected* but it is *correct*.
>    Here are some other correct implementations:
>
>     * Compiler refuses to compile the program
>     * Program aborts when executed
>     * Program runs `rogue'
>     * Program assigns 3 to memory location 0xffff8e20
>
>    Remember, dereferencing a hard-coded address (in C) has
>    implementation-defined effect.
>
>    In the meanwhile, I agree that the one it chose is non-intuitive.

You don't seem to understand:  MOVE.W #$0003, $8e20, assuming that the
compiler generated the 3-word 68000 instruction [0x31FC,0x0003,0x8e20]
(immediate operand to absolute short address), will indeed assign 3 to
location 0xffff8e20, *exactly* the desired result.  This is because the
68000 will *sign-extend* a 16-bit absolute short address to 32 bits.

To assign 3 to location 0x00008e20, absolute long (32-bit) addressing must
be used to avoid the sign extension.  This is the 4-word instruction
[0x33FC,0x0003,0x0000,0x8e20].

Now, if the compiler generated the absolute long address instruction above
in this case, I would call it a bug.

Art Boyne, boyne@hplvla.hp.com

csbrod@immd4.informatik.uni-erlangen.de (Claus Brod) (04/17/91)

pardo@june.cs.washington.edu (David Keppel) writes:

>carter@cs.wisc.edu (Gregory Carter) writes:
>> (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>>translates to:
>>                       MOVE.W #$0003, $8e20

>In the meanwhile, I agree that the one it chose is non-intuitive.

Maybe not intuitive, but correct and faster than other options.

----------------------------------------------------------------------
Claus Brod, Am Felsenkeller 2,			Things. Take. Time.
D-8772 Marktheidenfeld, West Germany		(Piet Hein)
csbrod@medusa.informatik.uni-erlangen.de
Claus Brod@wue.maus.de
----------------------------------------------------------------------

asd@cbnewsj.att.com (Adam S. Denton) (04/20/91)

In article <690004@hplvec.LVLD.HP.COM> boyne@hplvec.LVLD.HP.COM (Art Boyne) writes:
>>In comp.lang.c, pardo@june.cs.washington.edu (David Keppel) writes:
>>
>>    carter@cs.wisc.edu (Gregory Carter) writes:
>>    > (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
>>    >translates to:
>>    >                       MOVE.W #$0003, $8e20
>>    >This is obviously not correct.
>>
>>    I'll agree that it isn't what I *expected* but it is *correct*.
>>    Here are some other correct implementations:
>>
>>     * Compiler refuses to compile the program
>>     * Program aborts when executed
>>     * Program runs `rogue'
>>     * Program assigns 3 to memory location 0xffff8e20
>
>You don't seem to understand:  MOVE.W #$0003, $8e20, assuming that the

He understands.  This debate is apparently about what assigning
to a casted expression should do.  The above is very clearly
    (cast) [something] = [some value];
which has never been defined to have any meaning whatsoever.
Let's not forget what GIGO stands for!

Now, consider rewriting that expression...instead of:
>>    > (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;
Why not toss the (unsigned int) cast?  What on earth is it there for anyway?
Maybe it should be before the 0x03 instead, or left out entirely.

If one removes the cast, *then* the statement has meaning, and ONLY then is
there a platform on which to decide whether the compiler's correct or not.
But as it stands, there's no reason why the statement *shouldn't* run `rogue'!

Adam Denton
asd@mtqua.att.com

torek@elf.ee.lbl.gov (Chris Torek) (04/20/91)

[  (unsigned int)(*((unsigned int *)0xffff8e20L)) = 0x03;  ]

In article <1991Apr19.172024.10364@cbnewsj.att.com> asd@cbnewsj.att.com
(Adam S. Denton) writes:
>This debate is apparently what assigning to a casted expression should do.

Actually, the original debate ignored the illegal source and concentrated
on the questionable (depends-on-sign-extension) object code.

>The above is very clearly
>    (cast) [something] = [some value];
>which has never been defined to have any meaning whatsoever.

Actually, it is defined far enough to require a diagnostic from an ANSI
conformant compiler (after which all bets are off).  (This is another
example I would not expect to find in a book on `how to use ANSI C',
and illustrates why such a book is not good for `knowing the ANSI
standard'.  It is also an example of why `knowing the standard' is
often unnecessary: it suffices to know `the result of a cast is a
value, not an object, and may not be modified' without knowing just
what is required of, and allowed of, a compiler when it is handed such
an expression.)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov