[comp.sys.hp] Array indexing problem

schell@llandru.ucdavis.edu (Stephan Schell) (12/22/89)

While attempting to port the dviselect program in the TeX distribution
over to a 9000/835 running hp-ux 3.1, I ran into a problem with array
indexing.  This problem is showcased in the code below:

#include <stdio.h>
char junk[11] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i' , 'j'};
main()
{
	int	i;

	printf("Enter a number between 20 and 29, inclusive: ");
	scanf("%d", &i);
	printf("Character selected is '%c'\n", (junk-20)[i]);
	exit(0);
}

For example, typing "25" in response to the prompt yields a segmentation
fault, even though the final value of (junk-20+i) is (junk+5).
The same behavior was observed when "(junk-20)[i]" was replaced by
"(junk-20+i)".  The program operated properly if "junk[i-20]" was used,
or if a temporary pointer, "junk2 = junk-20+i" was computed and then used.
The common thing among the usages that failed was that they all used
LDBX where the base register contained a value outside the allowable range,
even though the effective (or final) address was OK.  Thus, it seems as if
the segmentation fault is based on the base register contents rather than
the effective address.

Does this behavior seem bogus to anyone except me?

Is there any way to fix this problem without modifying the source code
of each and every application that does this sort of thing?

Thanks.
--
-------------------------------------------------------------------------------
Stephan Schell                          schell@llandru.ucdavis.edu
Dept. of Electrical Engineering         {ucbvax,lll-crg}!ucdavis!llandru!schell
      &  Computer Science
University of California, Davis         (916) 752-1326

daryl@hpcllla.HP.COM (Daryl Odnert) (12/23/89)

You have indeed discovered a bug in the s800 C compiler.  I will forward
the defect along to the C compiler team to investigate a fix.  If you
are interested in a technical description of what the problem is, read on.

Warning, entering compiler guru mode...

   Because the array named junk is the only global variable allocated in this
   program, it is allocated at the base of the global data area.  On the
   s800, the global data area begins at address 0x40000000.

   The HP Precision Architecture is a 64-bit virtually addressed machine.
   The 32-bit address 0x40000000 is implicitly selecting a 32-bit space
   register with the upper two bits of this address.  This space register
   is concatenated with the base register to form the 64-bit virtual address.

   When you subtract 20 bytes from the base address of junk, you get
   0x3FFFFFEC.  In this intermediate result, the upper two bits of the
   address have changed from 01 to 00, so they now specify a different
   space register.

   Unfortunately for us compiler writers, the LDBX (load byte indexed)
   instruction selects the space register for the memory reference *before*
   the base register and the index value are combined to form the effective
   address.  Thus, when the LDBX instruction is executed in your test
   program, we are accessing a virtual space that we don't have access to.
   Ergo, core dump.

Exit compiler guru mode.

Happy Holidays,
   Daryl Odnert
   daryl@hpda.hp.com
   HP California Languages Lab

karl@hpclkwp.HP.COM (Karl Pettis) (12/23/89)

> The common thing among the usages that failed was that they all used
> LDBX where the base register contained a value outside the allowable range,
> even though the effective (or final) address was OK.  Thus, it seems as if
> the segmentation fault is based on the base register contents rather than
> the effective address.

BINGO!  Yes, the segmentation fault IS based on the base register, not
the final effective address.  Your analysis is correct.  When using short
(32-bit) addressing, the top two bits of the base register select which
space will be used.  While the effective address is being computed, the
hardware is simultaneously retrieving the space register value.  The final
address is the concatenation of the space register value and the 32-bit
effective address.  If the effective address has a different top two bits,
the space register selected won't match the desired one and you will get
a segmentation fault.

> Does this behavior seem bogus to anyone except me?

At first glance, it does.  But if the hardware has to wait until the
effective address computation completes before starting the space
register selection, it will introduce a sequential dependency for
something that is done in parallel now.

By the way, this HAS been brought up before.  The biggest problem is
for Fortran and Pascal where the default is non-zero based arrays.
One would like to use the index directly after pre-computing a "base"
address that points to the "zero-th" element.  As you have discovered,
this doesn't work.

For ANSI C, the behavior of your program is undefined.  The ANSI spec
is very clear that intermediate results of pointer arithmetic must
remain within the bounds of an array (or at most pointing just past
the last element) in order for the computation to be defined.  This,
of course, is little comfort for people who have code which breaks.
But it is a justification for the compilers to not generate the extra
instruction for every array index that it would take to guarantee that
the space register is selected based on the effective address.

> Is there any way to fix this problem without modifying the source code
> of each and every application that does this sort of thing?

> Stephan Schell                          schell@llandru.ucdavis.edu

I feel your long range goal should be to root out such code.  But in the
meantime, the work-around I suggest is that you declare an initialized
dummy array at the start of your data area, $PRIVATE$.  You can do this
using the assembler, or with a C routine.  The relocatable with this
array declaration should appear first in your link command (after the
crt0 file).  The dummy array must be large enough to ensure that your
intermediate pointer calculations do not change the uppermost two bits
of the address.

You can verify that the dummy array is at the beginning of the data area
by using nm on your executable file.

- Karl Pettis
  Telnet 447-5754
  California Language Laboratory
  Hewlett-Packard
  Cupertino, CA

daryl@hpcllla.HP.COM (Daryl Odnert) (12/27/89)

In my previous response, I wrote:
> You have indeed discovered a bug in the s800 C compiler.  I will forward
> the defect along to the C compiler team to investigate a fix.

I guess I spoke a little bit too soon.  It seems to be arguable whether
or not pointer arithmetic outside the bounds of an array is guaranteed
to work in K&R C.  (The ANSI C standard, as Karl Pettis points out, is
very clear on this matter.  It's illegal.)  The C compiler group is reluctant
to fix this problem because it would mean generating extra instructions
for every global array indexing operation.

Please let us know how seriously this problem is impacting your ability
to get your work done.  If this is a major obstacle for you or others who
are reading this discussion, it is possible that some kind of solution
could be developed (e.g. a compiler option).


Daryl Odnert
daryl@hpda.hp.com
Hewlett-Packard California Language Lab

shankar@hpclisp.HP.COM (Shankar Unni) (01/03/90)

> very clear on this matter.  It's illegal.)  The C compiler group is reluctant
> to fix this problem because it would mean generating extra instructions
> for every global array indexing operation.

It's more than that: the construct given in the base note is very
non-portable - for example, there is an excellent chance that this code
will fail on Intel machines: after all, if the data is at the base of a
segment, what happens when you compute an intermediate address that is
outside the segment? Something could well be truncated, and you could end
up either crashing your program or, worse still, silently picking up garbage.

It's best to avoid such constructs if you want to keep your programs
portable between widely varying architectures.

If this is automatically generated code (like a Pascal-to-C translator or
something like that), then you can use Karl's suggested workaround
(variations on that will work on most architectures) to insert some padding
between the array base and the base of globals (and later go back and fix
the translator :-).
-----
Shankar Unni                                   E-Mail: 
Hewlett-Packard California Language Lab.     Internet: shankar@hpda.hp.com
Phone : (408) 447-5797                           UUCP: ...!hplabs!hpda!shankar

DISCLAIMER:
This response does not represent the official position of, or statement by,
the Hewlett-Packard Company.  The above data is provided for informational
purposes only.  It is supplied without warranty of any kind.