[net.micro.att] Hints for C programming on AT&T PC6300 PLUS

hsc@mtuxo.UUCP (h.cohen) (06/28/86)

This was written by one of our developers.  I have deleted the
author's name so that the author is not contacted personally.
*** Process with nroff/troff and mm macros ***
.TL
Porting C Language Programs to the AT&T PC6300 PLUS
.AU
.MT 0
These notes are intended to ease the process of porting C
language programs to the PC6300 PLUS.
The Intel 80286
processor (contained in the PC6300 PLUS) has several
architectural characteristics that distinguish it from the
other machines running UNIX*
.FS *
UNIX is a trademark of AT&T.
.FE
software.
The most important
of these are a 16-bit word size and base/offset addressing.
The current iAPX286 compiler provides two memory models,
large and small, distinguished by pointer size (16 bits for
small, 32 bits for large).
The fact that pointers have no
predetermined size further complicates the porting of C
programs to the PC6300 PLUS.
.HU "DATA SIZES"
Unlike most of the machines currently running the UNIX
operating system, the PC6300 PLUS has \fIint\fRs that are not the
same length as \fIlong\fRs (16 bits versus 32 bits).
So the following program fragment
will not produce correct results on the PC6300 PLUS (although
it would work fine on a Vax, for example):
.DS I
long    l;
char   *s;
   .
   .
   .
g(l, s);
   .
   .
   .
g(num, str)
int    num;
char  *str;
{
      printf("Error line %d: %s", num, str);
}
.DE
.P
The code generated by the compiler will cause two words to
be placed on the stack for the long \fIl\fR. \fIg\fR() declares its
first argument as an \fIint\fR, so only the first word of \fIl\fR will be
printed as a number.
The second word of \fIl\fR will be taken
as part of the string pointer, probably resulting in a core dump or
garbage output.
.P
This sort of mismatch between the types of a routine's actual
and formal parameters can be detected by using \fBlint\fR.
The
significant conflicts are:
.DS I
long     <==>       int
pointer  <==>       int        (shows up in large model)
pointer  <==>       long       (shows up in small model)
.DE
.P
When a routine is expecting a long argument, a constant
argument should have an "L" appended (e.g. \fIonelongarg(3L)\fR).
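.P
As an illustration only, the sketch below uses a variable-argument
routine, a case where the compiler cannot adjust a constant on the
caller's behalf.
It assumes an ANSI-style compiler that supplies <stdarg.h>;
\fIsum_longs\fR() is a hypothetical name, not a library routine.
.DS I
```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` arguments, each fetched as a long. */
long sum_longs(int count, ...)
{
    va_list ap;
    long total = 0L;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, long);   /* reads one long-sized argument */
    va_end(ap);
    return total;
}
```
.DE
.P
A call such as \fIsum_longs(3, 1L, 2L, 3L)\fR passes three long-sized
arguments.
Written without the "L", the constants would be passed as \fIint\fRs and
misread by the routine.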
.P
Not all problematic conflicts are detected by \fBlint\fR; for example:
.DS I
long        num;
char        *str;
   .
   .
   .
printf("Error line %d: %s", num, str); 
   .
   .
   .
.DE
.P
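This results in the same problem described for the previous
code fragment: "%d" consumes only the first word of \fInum\fR, and
the second word is taken as part of the string pointer.
("%ld" should be used in the format string for
printing the \fIlong\fR.)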
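.P
A minimal sketch of the corrected call follows.
The wrapper \fIformat_error\fR() and its buffer arguments are hypothetical,
introduced only so the result can be inspected; it also assumes a C
library that provides \fIsnprintf\fR().
.DS I
```c
#include <stdio.h>

/* Hypothetical helper: writes the message into buf (at most size bytes). */
int format_error(char *buf, size_t size, long num, const char *str)
{
    /* %ld consumes the whole long, so %s lines up with the pointer */
    return snprintf(buf, size, "Error line %ld: %s", num, str);
}
```
.DE
.P
Because the format and the argument types agree, the same source
behaves identically whether \fIint\fR is 16 or 32 bits wide.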
.P
Another frequent problem occurs in the large model with C
code like:
.DS I
char    *cptr;
cptr = (char *)  malloc(1024);
.DE
.P
where \fImalloc()\fR is never declared to return a pointer, but
has its return value cast to the appropriate pointer
type.
An undeclared function is assumed to return an \fIint\fR, so the
compiler will generate code to convert a 16-bit return
value into the 32-bit pointer, resulting in an
illegal segment selector.
The moral here is: return values
for functions must be declared correctly, at least to the
point of returning data of the correct size.
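.P
One way to satisfy this rule is sketched below, assuming a C library
that supplies <stdlib.h>; on older systems without that header, an
explicit declaration such as \fIextern char *malloc();\fR serves the same
purpose.
\fImake_buffer\fR() is a hypothetical wrapper used only for illustration.
.DS I
```c
#include <stdlib.h>   /* declares malloc() and free() with pointer types */

/* Hypothetical helper: returns a 1024-byte buffer, or NULL on failure. */
char *make_buffer(void)
{
    char *cptr;

    /* malloc() is declared, so the full pointer value is returned */
    cptr = (char *) malloc(1024);
    return cptr;
}
```
.DE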
.HU "POINTER ARITHMETIC"
Certain problems arise with pointer arithmetic on the PC6300 PLUS
due to its segmented address space (an address consists of a
segment selector and an offset, each 16 bits wide).
It is not
uncommon in C code to find loops such as:
.DS I
   .
   .
   .
for (ptr = array; ptr <= &array[LAST]; ptr++)
   .
   .
   .
.DE
.P
where a pointer is incremented past the end of a data
structure before the loop is exited.
The code generated
for the increment changes only the 16-bit offset.
So if the data structure is located at the end of a 64K
segment, the offset will wrap around when it is incremented 
past the end, and an infinite loop will result.
The same
problem can arise when stepping a pointer backwards through
an array, terminating when the pointer points \fIbefore\fR the array.
.P
The link editor tries to prevent the most common occurrences
of this phenomenon by reserving at least 4 bytes at the end
of each segment.
Thus, the largest data structure that
may be allocated is 65532 bytes (64K = 65536).
Arrays with elements
larger than 4 bytes and arrays that are stepped through
backwards have no protection from this sort of infinite loop.
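.P
One defensive pattern, shown here as a sketch rather than the only
remedy, is to let an integer index end the loop so that no address
outside the array is ever formed or compared.
The routine names \fIsum_forward\fR() and \fIfind_last\fR() are hypothetical.
.DS I
```c
/* Forward walk: the index, not a pointer comparison, ends the loop. */
long sum_forward(const int *array, int n)
{
    long total = 0L;
    int i;

    for (i = 0; i < n; i++)
        total += array[i];
    return total;
}

/* Backward walk: i runs from n down to 1, so array[i - 1] always lies
   inside the array and no address before array[0] is computed.
   Returns the index of the last match, or -1 if there is none. */
int find_last(const int *array, int n, int value)
{
    int i;

    for (i = n; i > 0; i--)
        if (array[i - 1] == value)
            return i - 1;
    return -1;
}
```
.DE
.P
Neither loop depends on the bytes the link editor reserves at the end
of a segment, so the element size does not matter.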
.HU "NULL POINTER REFERENCES"
Unlike Vaxes and 3Bs, the PC6300 PLUS traps and dumps core
when a null pointer is dereferenced.
The following code
may run correctly on a Vax or 3B but will dump core on the
PC6300 PLUS:
.DS I
char   *cptr, firstchar;
   .
   .
   .
firstchar = *cptr;       /* PC6300 PLUS dumps core here if cptr is NULL */
if (cptr == NULL)
        error();
else if (firstchar == 'X')
   .
   .
   .
.DE
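.P
The fragment can be reordered so that the test comes before the
dereference.
The sketch below wraps that logic in a hypothetical routine,
\fIstarts_with_x\fR(), whose return codes are chosen only for illustration.
.DS I
```c
#include <stddef.h>   /* NULL */

/* Hypothetical helper: returns 1 if the string starts with 'X',
   0 if it does not, and -1 if there is no string at all. */
int starts_with_x(const char *cptr)
{
    char firstchar;

    if (cptr == NULL)
        return -1;           /* test first; nothing is dereferenced */
    firstchar = *cptr;       /* safe: cptr is known to be non-null */
    return firstchar == 'X';
}
```
.DE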
.HU "SUMMARY"
A PC6300 PLUS port should begin by running the programs to be
ported through \fBlint\fR to find type conflicts between routine
actual and formal parameters.
Special care should be taken
to declare function return values correctly.
Pointers
should be used in such a manner that they never point
outside the confines of declared data structures.
Finally, every
reference through a pointer should be guarded against
a \fBNULL\fR value.