[comp.lang.c] huge problem

PEPRBV%CFAAMP.BITNET@MITVMA.MIT.EDU (Bob Babcock) (03/23/88)

I seem to have found a problem with huge pointers in Turbo C 1.5.
The following program can be compiled with either Turbo C or MSC 5.0
to demonstrate the problem.  Note that both Borland and especially
Microsoft are a little vague about the exact properties of huge
pointers, and since huge is an extension to the ANSI standard,
there's no place beyond their manuals to look.

By the way, do people at Borland (and Microsoft) see these lists?
I'll probably file an official bug report eventually, but so far I
haven't even bothered to mail in my registration cards.

----------
#include <stdio.h>

#ifdef __TURBOC__
#include <alloc.h>
#define ALLOCATE(x,y) farcalloc(x,y)
#else
#include <malloc.h>
#define ALLOCATE(x,y) halloc(x,y)
#endif

/* Compile this with the large model flag under Turbo C 1.5 to demonstrate
   an apparent problem with huge pointers to structures.  MSC 5.0 seems to
   get it right.
 */

struct record
   {
   unsigned char string[28];
   unsigned long stuff;
   };

int main(void)
   {
   struct record huge *xx;
   struct record *aa,*bb,*cc;

   xx=(struct record huge *)ALLOCATE(3001L,
                                    (unsigned long)sizeof(struct record));

#ifdef __TURBOC__
   printf("Output when compiled with Turbo C 1.5, large model\n");
#else
   printf("Output when compiled with MSC 5.0, large model\n");
#endif

   if(xx == 0)
      {
      printf("Warning: memory allocation failed\n");
      return 1;
      }

   printf("size of structure record is %d\n", sizeof(struct record));

/* Turbo C claims that huge pointers are always normalized, yet the first
   printf below prints the same segment for all 3 pointers, which is just plain
   wrong for xx[3000].string since it is more than 64K away from xx[0].string.
   More subtle is the problem with xx[2047].string which gets an offset of
   FFEC: the string crosses a segment boundary, so any use is likely to
   suffer from segment wrap around.  The second printf shows a work-around
   which gives the right answers, and may be more efficient if multiple
   references are made to the same structure.

   Microsoft C 5.0 seems to get this right.  The pointers are not normalized,
   but the starting address returned by halloc is adjusted so that element
   boundaries coincide with segment boundaries, so no element ever straddles
   one.  This is apparently why halloc requires that the element size be a
   power of 2 for arrays larger than 128K.
 */

   printf("xx[0] - %Fp, xx[2047] - %Fp,  xx[3000] - %Fp\n",
      xx[0].string, xx[2047].string, xx[3000].string);

   aa=(struct record *)xx;
   bb=(struct record *)(xx+2047);
   cc=(struct record *)(xx+3000);

   printf("aa -    %Fp, bb -       %Fp,  cc -       %Fp\n",
      aa->string, bb->string, cc->string);


   return 0;
   }

Output when compiled with Turbo C 1.5, large model
size of structure record is 32
xx[0] - 5767:000C, xx[2047] - 5767:FFEC,  xx[3000] - 5767:770C
aa -    5767:000C, bb -       6765:000C,  cc -       6ED7:000C

Output when compiled with MSC 5.0, large model
size of structure record is 32
xx[0] - 6640:0000, xx[2047] - 6640:FFE0,  xx[3000] - 7640:7700
aa -    6640:0000, bb -       6640:FFE0,  cc -       7640:7700

[Your segment registers will be different, of course]
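
For reference, the usual 8086 rule is that the physical address is
segment*16 + offset, and a "normalized" pointer is one whose offset has
been reduced to the range 0..15.  What follows is only a sketch of that
arithmetic in plain C (no far or huge keywords, and the helper names are
just for illustration, so it compiles anywhere); it uses the addresses
printed above to show that 5767:FFEC is the same location as the
normalized 6765:000C, and that a 28-byte string starting at offset FFEC
runs past the top of the 64K segment.

#include <stdio.h>

/* physical address of a segment:offset pair, per the 8086 rule */
static unsigned long linear(unsigned seg, unsigned off)
   {
   return (unsigned long)seg * 16UL + off;
   }

/* reduce a physical address to normalized segment:offset form */
static void normalize(unsigned long lin, unsigned *seg, unsigned *off)
   {
   *seg = (unsigned)(lin >> 4);     /* all but the low 4 bits go to the segment */
   *off = (unsigned)(lin & 0xFUL);  /* offset ends up in 0..15 */
   }

int main(void)
   {
   unsigned long base = linear(0x5767, 0x000C);   /* xx[0].string above */
   unsigned long elem = base + 2047UL * 32UL;     /* 32 == sizeof(struct record) */
   unsigned seg, off;

   normalize(elem, &seg, &off);
   printf("xx[2047].string normalized: %04X:%04X\n", seg, off);  /* 6765:000C */
   /* Kept in segment 5767 the offset is FFEC, and FFEC + 28 overflows
      past FFFF, so the string straddles the 64K boundary.            */
   return 0;
   }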

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/23/88)

In article <12575@brl-adm.ARPA> PEPRBV%CFAAMP.BITNET@MITVMA.MIT.EDU (Bob
Babcock) writes:
>   struct record huge *xx;
...
>   xx=(struct record huge *)ALLOCATE(3001L,...
>/* Turbo C claims that huge pointers are always normalized, yet...
>   More subtle is the problem with xx[2047].string which gets an offset of
>   FFEC: the string crosses a segment boundary, so any use is likely to
>   suffer from segment wrap around.
>   printf("xx[0] - %Fp, xx[2047] - %Fp,  xx[3000] - %Fp\n",
>      xx[0].string, xx[2047].string, xx[3000].string);

As I understand it, you are saying that the address xx[2047].string,
being a huge pointer, ought to be kept always normalized.

But this is not really necessary.  Normalization will occur when this
address is incremented, because the runtime routine to increment a huge
pointer will be called.  Thus code of the form

     while ((*a++ = *b++) != '\0')
	;

will work for huge pointers, because a++ and b++ will correctly cause
the segment value to be incremented when the offset wraps around.
Similarly, if you find the difference (b - a), the runtime routine to
subtract huge pointers ought to correctly return the number of elements
between a and b even if neither a nor b was stored in normalized form.
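
The element-count claim is easy to see if you think in terms of full
physical addresses.  This is not the actual compiler runtime, just a
sketch of the idea: convert each operand to segment*16 + offset before
subtracting, and the answer is right no matter how the two pointers
happen to be normalized.  (Subtracting only the offsets of xx[0] and the
normalized xx[2047] from the Turbo C output above would wrongly give 0.)

#include <stdio.h>

static unsigned long linear(unsigned seg, unsigned off)
   {
   return (unsigned long)seg * 16UL + off;   /* 8086 physical address */
   }

int main(void)
   {
   unsigned long a = linear(0x5767, 0x000C);  /* xx[0].string, as printed    */
   unsigned long b = linear(0x6765, 0x000C);  /* xx[2047].string, normalized */

   /* divide the byte distance by sizeof(struct record), i.e. 32 */
   printf("elements between a and b: %lu\n", (b - a) / 32UL);   /* 2047 */
   return 0;
   }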

What really matters is whether dereferencing a pointer, and doing
arithmetic with it in accordance with the rules of C, work correctly.

The problem really lies in the Turbo C documentation, which says that
huge pointers are stored in normalized form.  It should say that huge
pointers are normalized when necessary to correctly handle offset
wrap-around.  By contrast, if you increment a far pointer, offset
wrap-around is not correctly handled.  So for a far pointer, you must
keep the value normalized, or else pointer arithmetic won't work
correctly.  This is the critical difference between far and huge
pointers.
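
To make the contrast concrete, here is a rough model of the two
behaviours (again not the real compiler helper routines, and the names
are just for illustration): the "far" step below only touches the 16-bit
offset, so it silently wraps, while the "huge" step folds the overflow
into the segment and hands back a normalized result.

#include <stdio.h>

struct segoff { unsigned seg, off; };   /* stand-in for a seg:off pointer */

static struct segoff far_add(struct segoff p, unsigned long bytes)
   {
   p.off = (unsigned)((p.off + bytes) & 0xFFFFUL);  /* offset wraps; segment untouched */
   return p;
   }

static struct segoff huge_add(struct segoff p, unsigned long bytes)
   {
   unsigned long lin = (unsigned long)p.seg * 16UL + p.off + bytes;
   p.seg = (unsigned)(lin >> 4);        /* overflow is folded into the segment */
   p.off = (unsigned)(lin & 0xFUL);     /* and the result comes back normalized */
   return p;
   }

int main(void)
   {
   struct segoff p = { 0x5767, 0xFFEC };   /* xx[2047].string from the output above */
   struct segoff f = far_add(p, 32);       /* step to the next record, far-style  */
   struct segoff h = huge_add(p, 32);      /* step to the next record, huge-style */

   printf("far:  %04X:%04X (wrapped -- same address as xx[0].string)\n", f.seg, f.off);
   printf("huge: %04X:%04X (overflow carried into the segment)\n", h.seg, h.off);
   return 0;
   }
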
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi