[comp.lang.c] Portable Code

g-rh@cca.CCA.COM (Richard Harter) (03/08/88)

In article <1124@silver.bacs.indiana.edu> backstro@silver.UUCP (Dave White) writes:
>In article <22314ad9@ralf.home> Ralf.Brown@B.GP.CS.CMU.EDU writes:
>>In article <1106@silver.bacs.indiana.edu>, backstro@silver.bacs.indiana.edu
>>(Dave White) writes:
>>}When will [PC C compiler vendors] realize that some of us really want
>>}the option of using 32-bit ints to port Unix-born code?  
>>
>>#define int long
>>
>>Need I say more?
>It isn't that simple. Your suggestion would ensure that the
>compiler thought the code I was compiling saw ints and longs
>the same way, but this would not be the case with the compiler's
>run-time library, which would need to be recompiled and debugged.
>alloc and friends, for example, take int-sized arguments in my machine!

The idea is right, but insufficient.  One way to do things is thusly:

Replace all declaration types int, long, char, float, double by INT,
LONG, CHAR, FLOAT, DOUBLE, with the obvious definitions in your home
system.

Define, for each library routine used, typedefs for each argument,
and for the return.  For each library routine call, cast the argument(s).

All this good stuff goes into an include file that all files include.
In the target machine the type definitions change, according to the
target machine.  This can be taken care of conveniently by ifdef's.

As a matter of practice, it is better if each library routine
used appears explicitly only once in your code, e.g. your standard
include file contains definitions for each library routine in terms
of itself.  [This doesn't work for printf and family which have
variable number of arguments.]
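
A minimal sketch of such an include file, assuming an ANSI-style compiler.
The file name port.h, the TARGET_16BIT_INT switch, and the MALLOC, STRLEN
and name_length names are made up for illustration. Upper-case wrappers are
used here instead of redefining the routines in terms of themselves.

```c
/* port.h (hypothetical): every source file includes this one header. */
#include <stdlib.h>
#include <string.h>

/* TARGET_16BIT_INT is an assumed switch, set with -D on the target. */
#ifdef TARGET_16BIT_INT
typedef long INT;       /* plain int is too small on that target */
#else
typedef int INT;        /* assumed: int is wide enough on the home system */
#endif
typedef long LONG;
typedef char CHAR;
typedef float FLOAT;
typedef double DOUBLE;

/* One typedef per argument and per return value of each routine used. */
typedef size_t MALLOC_ARG1;
typedef void *MALLOC_RET;
typedef const char *STRLEN_ARG1;
typedef size_t STRLEN_RET;

/* Each library routine is named exactly once, here, with its casts. */
#define MALLOC(n)  ((MALLOC_RET) malloc((MALLOC_ARG1) (n)))
#define STRLEN(s)  ((INT) (STRLEN_RET) strlen((STRLEN_ARG1) (s)))

/* Ordinary code then uses only the wrapped names and types. */
static INT name_length(const CHAR *name)
{
    return STRLEN(name);
}
```

Porting then means editing the typedefs in this one file, not every call.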

This resolves the first level of portability;  everything is cast
to the right size.  It does not help you if your home machine has
32 bit ints, and you actually have quantities which are greater than
16 bits in size which you pass to library routines.  It also doesn't
help if the library routines in the target machine use different
calling sequences.

You can do this conversion as a quasi-mechanical process; i.e. you
do greps on the original code to generate lists of files to be changed,
and edit scripts to implement the changes.  It's only quasi-mechanical
because you have to do hand checks on the results.

-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

brian@radio.toronto.edu (Brian Glendenning) (07/22/88)

I am currently involved with porting/extending a networked graphics server I
wrote for Sun 3 machines to Iris workstations, and it will likely be ported to
new machines in the future.

So I thought I'd ask the net for advice about how to write portable code. I
know the topic is incredibly broad, but I expect I'm not the only person on
the net who could benefit from some advice (e.g., people like myself who
aren't primarily programmers and don't know the lore and wisdom of
professionals).

Some issues that could be addressed:

        1) Byte order and type size differences. What is the best way for
           dealing with these? What are the "gotcha"'s?
        2) BSD/SysV/whatever differences. What assumptions are likely to
           lead me into trouble?
        3) Source code management: what's the best way to maintain codes that
           run on a variety of machines. #ifdef MACHINE_TYPE? Never or rarely 
           use #ifdef, edit makefiles? ???
	4) Everything I've forgotten :-)

Replies containing simple do's and dont's should be mailed to me and I will
summarize to the net. Controversial or complex ideas should be posted to
the net for discussion.

Thanks in advance.

-- 
Brian Glendenning                INTERNET - brian@radio.astro.toronto.edu
Radio Astronomy, U. Toronto          UUCP - {uunet,pyramid}!utai!radio!brian
+1 (416) 978-5558                  BITNET - glendenn@utorphys.bitnet

jdp@adiron.UUCP (Powell) (07/28/88)

In article <1157@radio.toronto.edu>, brian@radio.toronto.edu (Brian Glendenning) writes:
> 
> I am currently involved with porting/extending a networked graphics server I
> wrote for Sun 3 machines to Iris workstations, and it will likely be ported to
> new machines in the future.
> ...
> Some issues that could be addressed:
> 
>         1) Byte order and type size differences. What is the best way for
>            dealing with these? What are the "gotcha"'s?

I use a conversion program to transfer data files from one machine
to another.  dd conv=swab is sufficient for short data.  Long, float, and
double are a different story.
I use sizeof (variable) rather than sizeof (type) whenever possible.
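
A rough sketch of both points, assuming 4-byte longs in the data file. The
names swap32, samples and sample_count are invented for the example, and
float and double formats are not handled here at all.

```c
#include <stddef.h>

/* Reverse the four low-order bytes of a value.  Shifts and masks act on
   the value, so the result does not depend on the host's byte order. */
static unsigned long swap32(unsigned long v)
{
    return ((v & 0x000000FFUL) << 24) |
           ((v & 0x0000FF00UL) <<  8) |
           ((v & 0x00FF0000UL) >>  8) |
           ((v & 0xFF000000UL) >> 24);
}

static long samples[8];

/* sizeof applied to the variable stays right if its type changes later. */
static size_t sample_count(void)
{
    return sizeof samples / sizeof samples[0];
}
```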

>         2) BSD/SysV/whatever differences. What assumptions are likely to
>            lead me into trouble?

I have not dealt with this too much.

>         3) Source code management: what's the best way to maintain codes that
>            run on a variety of machines. #ifdef MACHINE_TYPE? Never or rarely 
>            use #ifdef, edit makefiles? ???

I use #if defined(XXX) for each machine type.  The XXX is sun for SUNs,
vax for VAXEN and sgi for Iris (I assume this is the SILICON GRAPHICS IRIS).
The symbol sgi is predefined on SILICON GRAPHICS IRIS systems.

Watch out for differences in include file names and types.  On the
version I used, the struct direct went with read(), struct dirent went
with readdir().  Not exactly what I expected.

> 	4) Everything I've forgotten :-)

The only displayable data type on the Iris I worked with was "short".
Byte data had to be converted to short before interfacing to the graphics
routines.

Position (0,0) on Iris is the lower left corner.  Position (0,0) on the
SUN is the upper left corner.

The Iris has at least 2 modes of operating within graphics.  One requires
their window manager and one disallows the window manager.  It is possible
within the program to determine which mode you're in.  The routines to
be used in either case are completely different.

Iris operates as if it were in the SUN mode in which "click-to-type"
was true.  Moving the cursor does not detach control from a particular
window.

Fortran and C mixtures require bridge functions.  The Fortran and C
interface is entirely different from Berkeley.

Good luck.


					John D. Powell
					PAR Technology

brian@radio.toronto.edu (Brian Glendenning) (07/29/88)

A week or so ago I asked for advice on how to write portable C code. This is
the promised summary. If you would like me to send along the unedited messages
(including Henry Spencer's 10 commandments for C programmers) I'd be happy to
do so. In order to save net bandwidth I've edited this down pretty hard, maybe
too hard.

This message is based on the responses of the following people (thanks!):
     ray@amsdsg (Ray Ryan)
     henry@utzoo.uucp (Henry Spencer)
     rsalz@pineapple.bbn.com (rich $alz)
     msb@sq.com (Mark Brader)
     proxftl!bill (T. William Wells)
     flaps@dgp (Alan J Rosenthal)
     chip@vector (Chip Rosenthal)
     jim@dandelion.ci.com (Jim Hurt)


Leading >'s are the relevant question from my initial message, followed by a
summary of the responses. Errors are to be attributed to my misunderstanding,
not to the above respondents.

>        1) Byte order and type size differences. What is the best way for
>           dealing with these? What are the "gotcha"'s?


Byte order problems are most serious in networks. Other things to watch for
are multicharacter constants (e.g. don't use int x='ab').
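
One common way around byte order on a network, sketched here as an
assumption and not as anything the respondents spelled out, is to move
integers one byte at a time in an agreed order. The function names are
hypothetical.

```c
/* Store the low 32 bits of v, most significant byte first. */
static void put_be32(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char) ((v >> 24) & 0xFF);
    buf[1] = (unsigned char) ((v >> 16) & 0xFF);
    buf[2] = (unsigned char) ((v >>  8) & 0xFF);
    buf[3] = (unsigned char) (v & 0xFF);
}

/* Rebuild the value from four bytes in that same order. */
static unsigned long get_be32(const unsigned char *buf)
{
    return ((unsigned long) buf[0] << 24) |
           ((unsigned long) buf[1] << 16) |
           ((unsigned long) buf[2] <<  8) |
            (unsigned long) buf[3];
}

/* Small helpers so the behaviour can be checked in one call each. */
static unsigned long be32_roundtrip(unsigned long v)
{
    unsigned char buf[4];
    put_be32(buf, v);
    return get_be32(buf);
}

static int be32_first_byte(unsigned long v)
{
    unsigned char buf[4];
    put_be32(buf, v);
    return buf[0];
}
```

Neither end ever looks at how an int is laid out in memory, so the same
code should behave the same on a VAX and a Sun.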

Encapsulate size information in typedef's (e.g. typedef short WORD16).  Be
careful in printf statements: %d is not used for longs. It is best to cast to
long and use %ld if the sizes are hidden in typedef's. To avoid assuming a
particular size for a variable you can use bit expressions like x |= ~7 rather
than x |= 0xFFF8, which assumes the variable is 16 bits long.
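
A small sketch of these points. WORD32, word_text and round_down8 are
illustrative names, and the mask is shown with & (clearing the low bits),
which is the same idea applied to a different operator.

```c
#include <stdio.h>

typedef short WORD16;
typedef long  WORD32;   /* assumed choice for this machine */

/* The format never depends on what WORD32 really is: cast, then %ld. */
static const char *word_text(WORD32 w)
{
    static char buf[32];
    sprintf(buf, "%ld", (long) w);
    return buf;
}

/* ~7UL has every bit set except the low three, however wide long is.
   A literal such as 0xFFF8 would also wipe out bits 16 and up. */
static unsigned long round_down8(unsigned long x)
{
    return x & ~7UL;
}
```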

It is safe to assume char is at least 8 bits, short and int at least 16, and
long at least 32, and that the unsigned types are the same length as the
signed types. You cannot assume that char is signed or unsigned. You should
avoid mixing signed and unsigned types in arithmetic or compare operations
unless you know there are no negative values.
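
To make the signedness traps concrete, here is a hedged sketch assuming
8-bit chars. The function names are made up. The cast in mixed_less only
spells out the conversion a compiler applies silently when an int meets an
unsigned int.

```c
/* Value of a character as 0..255, whether plain char is signed or not. */
static int byte_value(char c)
{
    return (unsigned char) c;
}

/* What "a < b" does with mixed types: a is converted to unsigned first,
   so -1 turns into the largest unsigned value. */
static int mixed_less(int a, unsigned int b)
{
    return (unsigned int) a < b;
}

/* Deal with negative values before any conversion can happen. */
static int safe_less(int a, unsigned int b)
{
    return a < 0 || (unsigned int) a < b;
}
```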

You must of course be careful of function arguments, especially constants.
You must not assume pointers can be freely converted to integers. NULL (0)
must be cast if it is a function argument. Don't write into one member of a
union and read from another that has a different type.
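
Two of these rules in sketch form, assuming a compiler that supplies the
ANSI <stdarg.h> header (older systems would need their own mechanism).
count_strings and float_roundtrip are hypothetical names, and memcpy is
used here as one possible substitute for reading the "wrong" union member.

```c
#include <stdarg.h>
#include <string.h>

/* Count string arguments up to a terminating null pointer.  The caller
   must pass (char *) 0, not a bare 0 or NULL, because nothing tells the
   compiler that a pointer is expected in that position. */
static int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != (const char *) 0; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}

/* Move a float out to raw bytes and back with memcpy, instead of writing
   one union member and reading another of a different type. */
static float float_roundtrip(float f)
{
    unsigned char bytes[sizeof f];
    float back;

    memcpy(bytes, &f, sizeof f);
    memcpy(&back, bytes, sizeof back);
    return back;
}
```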


>        2) BSD/SysV/whatever differences. What assumptions are likely to
>           lead me into trouble?

tty mode settings and esoteric library routines and system calls will cause
problems. It will often not be possible to write common code, and two versions
will be required.

Many machines (e.g. Suns) have mixed environments, where you can, e.g., use
memcpy instead of bcopy (and memcpy is the ANSI mandated function).

>        3) Source code management: what's the best way to maintain codes that
>           run on a variety of machines. #ifdef MACHINE_TYPE? Never or rarely 
>           use #ifdef, edit makefiles? ???

Most agreed that it was better to #ifdef on specific characteristics than to
#ifdef on machine type. For example, do not do:
	#ifdef	BSD
	#define strchr index
	#define strrchr rindex
	#endif	/* BSD */
but instead do:
	#ifdef USES_INDEX
	#define strchr index
	#define strrchr rindex
	#endif	/* USES_INDEX */

It was also widely believed that #ifdef's should be kept to a minimum since
they can make management awkward. For things that are very different (e.g. 
networking) it is better to use a consistent internal interface and build
different libraries for each interface.

It is helpful to have a config.h file that contains "all" the #ifdef
statements, and to keep the Makefile the same for all machines. 
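
A sketch of how such a config.h might be laid out. MY_OLD_BSD_BOX is a
made-up machine flag and last_component a made-up function; only the
USES_INDEX idea comes from the replies above.

```c
#include <string.h>

/* config.h (hypothetical): the one place where machines are named. */
#if defined(MY_OLD_BSD_BOX)      /* invented flag, passed with -D */
#define USES_INDEX
#endif

/* Everything from here on tests characteristics, never machine names. */
#ifdef USES_INDEX
#include <strings.h>
#define strchr  index
#define strrchr rindex
#endif

/* Ordinary code uses the ANSI names and contains no #ifdef at all. */
static const char *last_component(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```

A new machine then costs one more line in the first block, and the rest of
the source is left alone.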


>        4) Everything I've forgotten :-)

Read and follow Henry Spencer's 10 commandments for C programmers.  Buy a copy
of "Portable C and UNIX System Programming" by J.E.Lapin.

Don't write	#define MAC(xx) "xx"
which gives different results on different systems.  There's no portable
way to write a macro MAC such that MAC(k) would expand to "k" or 'k'.
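
For what it is worth, ANSI C defines a # operator for the string case.
The sketch below assumes every compiler you target implements it, which
pre-ANSI compilers do not, so it does not help with older systems. STR,
mac_k and str_k are illustrative names.

```c
/* An ANSI preprocessor does not substitute inside a string literal,
   so this always yields the two letters "xx". */
#define MAC(xx) "xx"

/* The ANSI # operator turns the argument's spelling into a string. */
#define STR(x) #x

static const char *mac_k(void)
{
    return MAC(k);
}

static const char *str_k(void)
{
    return STR(k);
}
```

There is still no operator that produces the character constant 'k';
indexing the string, as in STR(k)[0], is about as close as it gets.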

Varargs.  There is no portable way to define a function that takes a
varying number of arguments.  If you try, you will at best land yourself
in a bunch of #ifdefs.  Better to design your functions so that each one
takes a fixed number of arguments.
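
One way to follow that advice, as a sketch with invented names: pass an
array and a count where a varying argument list would otherwise be
tempting.

```c
/* Always exactly two arguments, however many values there are. */
static long sum_values(const long *values, int count)
{
    long total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Example call: the caller builds the list, the callee stays fixed. */
static long sum_demo(void)
{
    static const long v[] = { 10L, 20L, 12L };
    return sum_values(v, (int) (sizeof v / sizeof v[0]));
}
```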

Keep the significant parts of at least your external variable names short.


And finally, Jim Hurt <jim@dandelion.ci.com> sent me some general meta-rules.
I just include his points here; his rationales are included in the unedited
file I'll send out on request.

1.  Determine what computer/system combination is preferred by the people
actually generating the code.  Under no circumstances allow them to generate
code on that machine.

2.  Never do your code development on a machine made by Digital Equipment
Corporation.  These machines should be the first machine that your code gets
ported to.

3.  Select a language that has an ANSI standard, then use copies of that
standard as the programming language manual for use by your coders.  Do not
let your coders have access to the language manual provided by your computer
supplier.

4.  Carefully isolate your machine dependent code in a few very
carefully designed procedures.


I suggest further discussion, if any, now be directed at the net. Thanks again.


-- 
Brian Glendenning                INTERNET - brian@radio.astro.toronto.edu
Radio Astronomy, U. Toronto          UUCP - {uunet,pyramid}!utai!radio!brian
+1 (416) 978-5558                  BITNET - glendenn@utorphys.bitnet

mark@cbnews.ATT.COM (Mark Horton) (07/29/88)

In article <1157@radio.toronto.edu> brian@radio.astro.toronto.edu (Brian Glendenning) writes:
>
>So I thought I'd ask the net for advice about how to write portable code. I
>know the topic is incredibly broad, but I expect I'm not the only person on
>the net who could benefit from some advice (e.g., people like myself who
>aren't primarily programmers and don't know the lore and wisdom of
>professionals).

Professionals have to learn this stuff some way too, and up until now the
only option has been experience and the seat of your pants.

You might be interested in an upcoming book I'm writing called "How to
Write Portable Software in C".  It covers most of these sorts of issues.
It's from Prentice Hall and should hit the stores next spring or summer.

>Some issues that could be addressed:
>
>        1) Byte order and type size differences. What is the best way for
>           dealing with these? What are the "gotcha"'s?

In general, don't make assumptions about them.  Things like

	int c;
	read(0, &c, 1);
	if (c == '\n')

will work on a little endian machine like a VAX but fail on a big
endian machine like a Sun.  There are zillions of potential gotchas
like this, all of which come from assuming something in the folklore
but not in the manual.  (You aren't supposed to read into ints, just
into chars, if you intend the data type to be a char.)
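
A sketch of the byte-order-independent version, assuming a system with
read(), write() and pipe(). The names read_byte and newline_survives_pipe
are invented, and the second function exists only to exercise the first.

```c
#include <unistd.h>

/* Read one byte into a char-sized object, then widen it explicitly.
   Returns the byte as 0..255, or -1 on end of file or error. */
static int read_byte(int fd)
{
    unsigned char ch;

    if (read(fd, &ch, 1) != 1)
        return -1;
    return ch;
}

/* Push a newline through a pipe and check that it comes back intact. */
static int newline_survives_pipe(void)
{
    int fds[2];
    int c = -1;

    if (pipe(fds) != 0)
        return 0;
    if (write(fds[1], "\n", 1) == 1)
        c = read_byte(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return c == '\n';
}
```

Because only one byte is stored into a one-byte object, it no longer
matters which end of an int the machine fills first.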

>        2) BSD/SysV/whatever differences. What assumptions are likely to
>           lead me into trouble?

Zillions of them.  Without the book, which lists the functions and rates
their portability, your best bet is to get both a System V and a 4BSD
manual and verify that everything you do is in both manuals.  That is far
easier said than done.  A typical UNIX system manual will imply that
everything in it is portable, including the local enhancements that are
not present anywhere else.

>        3) Source code management: what's the best way to maintain codes that
>           run on a variety of machines. #ifdef MACHINE_TYPE? Never or rarely 
>           use #ifdef, edit makefiles? ???

There are several choices:

	#ifdef vax
		This is useful if you need to key on the hardware type.
		It's automatically defined by cpp.

	#ifdef SYSV
		Useful to key on the operating system.  You have to define
		this with -D or #define.

	#ifdef DBM
		Keying on a particular feature or option you need, you might
		have several of these and allow each to be configured.  You
		must define these yourself.

	#ifdef FIONREAD
		Keying on some constant in a header file that is an indicator
		of whether the feature you need is present.  These are
		automatic but you can't always use them.  In this case you
		might be using this ifdef to protect an ioctl(.. FIONREAD ..)
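
The last choice might look roughly like this. It is a sketch that assumes
FIONREAD, where it exists at all, is defined by <sys/ioctl.h>; some
systems put it in another header. bytes_pending is a made-up name.

```c
#include <sys/ioctl.h>

/* Number of bytes waiting on fd, or -1 if that cannot be determined,
   either because the call failed or because FIONREAD is missing. */
static long bytes_pending(int fd)
{
#ifdef FIONREAD
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) == 0)
        return n;
#else
    (void) fd;          /* feature absent: nothing to ask */
#endif
    return -1;
}
```

The callers never mention FIONREAD; they only have to cope with -1.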

>	4) Everything I've forgotten :-)

It took a 350 page book to cover this, so it's hard to do in a Usenet
message.  I recommend buying the book, if you can wait.

	Mark Horton

pardo@june.cs.washington.edu (David Keppel) (07/29/88)

Can somebody send me or tell me where to get a copy of the Indian
Hills C Style manual?  I had one once, but now all I have is the
hardcopy.  Advance thanks.

	;-D on  ( But then who cares about style? )  Pardo

		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

peter@ficc.UUCP (Peter da Silva) (08/01/88)

Here's one... don't assume that ~0 (all 1s) == -1. On a ones-complement machine
-1 is (of course) ~1.
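
A sketch of saying which one you mean (function names invented): use an
unsigned type when you want a bit pattern, and write -1 when you want the
number.

```c
#include <limits.h>

/* All bits set: ~0u is this pattern whatever the signed representation. */
static unsigned int all_ones(void)
{
    return ~0u;
}

/* Converting -1 to unsigned is defined arithmetically (modulo
   UINT_MAX + 1), so it does not depend on how -1 is stored. */
static unsigned int minus_one_as_unsigned(void)
{
    return (unsigned int) -1;
}
```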
-- 
Peter da Silva, Ferranti International Controls Corporation, sugar!ficc!peter.
"You made a TIME MACHINE out of a VOLKSWAGEN BEETLE?"
"Well, I couldn't afford another deLorean."
"But how do you ever get it up to 88 miles per hour????"