[comp.sys.atari.st] Type coercion

silvert@dalcs.UUCP (04/04/87)

Since I occasionally post source to the net, I thought it might be good
to try to reach a consensus on whether type coercion is desirable in
portable code, or whether, as the books say, it is a despicable and
dangerous practice which ranks with goto's.

I refer specifically to date and time functions, which sometimes use two
separate integer/CARDINAL variables (I refer to C/Modula-2 here), but
sometimes pack both into a long/LONGCARD variable.  Conversion of one to
the other involves either arithmetic or bit manipulation, and can lead to
problems such as the recently reported type-extension bug in settime and
autodisk.
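
To make the arithmetic route concrete, here is a minimal C sketch of packing and unpacking with shifts and masks. All function names are hypothetical, the fixed-width types from <stdint.h> stand in for a 16-bit int and a 32-bit long, and placing the date in the high 16 bits is an assumption for illustration only, not taken from any system call's documentation. The deliberately wrong variant shows one way a type-extension problem can arise: a signed 16-bit value with its top bit set is sign-extended when widened, and the extra one bits overwrite the date half.

```c
#include <stdint.h>

/* Assumption: date occupies the high 16 bits, time the low 16 bits. */
uint32_t pack_datetime(uint16_t date_word, uint16_t time_word)
{
    /* Widen as unsigned before shifting, so no sign bits leak in. */
    return ((uint32_t)date_word << 16) | (uint32_t)time_word;
}

uint16_t unpack_date(uint32_t datetime)
{
    return (uint16_t)(datetime >> 16);
}

uint16_t unpack_time(uint32_t datetime)
{
    return (uint16_t)(datetime & 0xFFFFu);
}

/* Deliberately wrong: the signed time is sign-extended when widened. */
uint32_t pack_datetime_buggy(int16_t date_word, int16_t time_word)
{
    int32_t wide_time = time_word;   /* -32768 becomes 0xFFFF8000 */
    return ((uint32_t)date_word << 16) | (uint32_t)wide_time;
}
```

With a time word of 0x8000, the correct version round-trips both halves, while the buggy version returns a value whose date half reads back as 0xFFFF.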

Type coercion in this case means putting both formulations in the same
location by using a union:

	union { long datetime;
		structure { int date;
			    int time;
			  } pair
		} trick;

(apologies if I haven't got the syntax quite right)
or a variant record in Modula-2

	TYPE trick = RECORD
			CASE : BOOLEAN OF
			TRUE : datetime : LONGCARD |
			FALSE: date, time : CARDINAL
			END
		     END;

Then once you have obtained the value of datetime you have date and time
ready to work with, and vice versa, and the conversions are automatic.
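
A compilable rendering of the C version might look like the sketch below. It swaps in fixed-width types from <stdint.h> (an assumption, standing in for a 16-bit int and a 32-bit long), and the names dt_overlay, first, second, and first_half_is_high_word are hypothetical. Which 16-bit half of datetime shows up in which field depends on the machine's byte order, so the conversion is only "automatic" where the layout happens to match what the caller expects.

```c
#include <stdint.h>

union dt_overlay {
    uint32_t datetime;
    struct {
        uint16_t first;    /* half stored at the lower address  */
        uint16_t second;   /* half stored at the higher address */
    } pair;
};

/* Returns 1 if the field at the lower address holds the high 16 bits
   (big-endian, as on the 68000), 0 if it holds the low 16 bits. */
int first_half_is_high_word(void)
{
    union dt_overlay t;
    t.datetime = 0x12345678u;
    /* In C, reading a member other than the one last written
       reinterprets the stored bytes; the result is layout-dependent. */
    return t.pair.first == 0x1234;
}
```

On a little-endian machine the function returns 0, meaning the two fields would come out swapped relative to a big-endian machine.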

I have used this in my own datesetting program and find it convenient,
but wonder if it creates problems with other compilers and languages.
Pascal also has the variant record feature, but this use of type
coercion is illegal -- whether the compilers on the ST permit it I don't
know.  Clearly this usage assumes that structures are stored
consecutively, which I think is always the case.

Anyway, comments on this usage would be appreciated.  If I get a lot of
mail, I will post the conclusions to the net.
-- 
Bill Silvert, Modelling/Statistics Group
Bedford Institute of Oceanography, Dartmouth, NS, Canada
CDN or BITNET: silvert@cs.dal.cdn	-- UUCP: ..!{seismo|utai}!dalcs!silvert
ARPA: silvert%dalcs.uucp@seismo.CSS.GOV	-- CSNET: silvert%cs.dal.cdn@ubc.csnet

wheels@mks.UUCP (04/06/87)

In article <2499@dalcs.UUCP>, silvert@dalcs.UUCP (Bill Silvert) writes:
 > Type coercion in this case means putting both formulations in the same
 > location by using a union:
 > 
 > 	union { long datetime;
 > 		structure { int date;
 > 			    int time;
 > 			  } pair
 > 		} trick;
 > 
 > (apologies if I haven't got the syntax quite right)
 > 
 > I have used this in my own datesetting program and find it convenient,
 > but wonder if it creates problems with other compilers and languages.
 > Clearly this usage assumes that structures are stored

I haven't used unions much, but isn't it true that C may attempt to
re-align the items within a structure (or union) to lie on proper
boundaries (e.g. word boundaries, longword boundaries, etc) where
necessary? This means that there may be padding bytes within the
structure. Could this cause errors where the elements of the structures
within the union don't lie "on top of one another" as one would
expect?
-- 
Gerry Wheeler                  {seismo,decvax,ihnp4}!watmath!mks!wheels
Mortice Kern Systems Inc.

manis@ubc-cs.UUCP (04/07/87)

In article <2499@dalcs.UUCP> silvert@dalcs.UUCP (Bill Silvert) writes:

>Type coercion in this case means putting both formulations in the same
>location by using a union:
>
>	union { long datetime;
>		structure { int date;
>			    int time;
>			  } pair
>		} trick;
>
>(apologies if I haven't got the syntax quite right)
>or a variant record in Modula-2
>
>	TYPE trick = RECORD
>			CASE : BOOLEAN OF
>			TRUE : datetime : LONGCARD |
>			FALSE: date, time : CARDINAL
>			END
>		     END;
>
>Then once you have obtained the value of datetime you have date and time
>ready to work with, and vice versa, and the conversions are automatic.

I won't bother to correct your C syntax, Bill; it's certainly close enough.
What you're referring to is not coercion (which just refers to the normal
conversions in a language) but "type-cheating".

Is it moral? Well, in the strictest sense, no. However, it is impossible
to write code which is simultaneously totally portable and yet manages to
access system-dependent information.

>Pascal also has the variant record feature, but this use of type
>coercion is illegal -- whether the compilers on the ST permit it I don't
>know.

I know of no Pascal compiler which actually checks for consistent use
of variants. In particular, Pascal allows untagged variants (just like
C unions), in which of course the programmer can select any variant without
regard to which variant is presently valid. Of course, the ISO standard says
this is illegal, but all they mean is that the onus is on the programmer (or 
possibly the compiler) to determine what is actually meant (i.e., how the 
record is laid out in memory).

>Clearly this usage assumes that structures are stored
>consecutively, which I think is always the case.

You (and anyone who compiles your code) must know how the compiler lays out
the information. In that regard, C is safer than Modula-2, because the draft
ANSI standard says that fields must be laid out in order of the
declarations, while Modula-2 compilers are not so constrained.

You also have to beware of alignment bytes which some compilers insert.

As a suggestion, here are some guidelines:

a) Don't type-cheat more than you have to.

b) In C, Pascal, and Modula-2, do type-cheating by means of either an
explicit type declaration (as above), or by means of a cast (Pascal 
doesn't have casts, but Modula-2 and C do).

c) Localise type-cheating code to a specific module.

d) Put in plentiful comments showing the intent of the code.
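
Guidelines (b) through (d) could be followed in C along the lines of the sketch below. Everything here is illustrative: the names dt_pair, dt_cheat, dt_join, dt_date, and dt_time are hypothetical, the fixed-width types and the C11 _Static_assert are assumptions that need a much newer compiler than the ones discussed in this thread, and the date-first field order is arbitrary. The overlay lives in one file, the layout it relies on is checked at compile time, and callers see only ordinary functions.

```c
#include <stddef.h>
#include <stdint.h>

/* TYPE CHEAT: overlays one 32-bit value on two 16-bit halves.
   Nothing outside this file should touch the union directly. */
struct dt_pair {
    uint16_t date;
    uint16_t time;
};

union dt_cheat {
    uint32_t datetime;
    struct dt_pair pair;
};

/* Refuse to compile if padding or sizes would break the overlay. */
_Static_assert(sizeof(union dt_cheat) == 4, "unexpected padding in dt_cheat");
_Static_assert(offsetof(struct dt_pair, time) == 2, "time is not at offset 2");

uint32_t dt_join(uint16_t date_word, uint16_t time_word)
{
    union dt_cheat u;
    u.pair.date = date_word;
    u.pair.time = time_word;
    return u.datetime;
}

uint16_t dt_date(uint32_t datetime)
{
    union dt_cheat u;
    u.datetime = datetime;
    return u.pair.date;
}

uint16_t dt_time(uint32_t datetime)
{
    union dt_cheat u;
    u.datetime = datetime;
    return u.pair.time;
}
```

The round trip through dt_join and dt_date or dt_time is consistent on any one machine, but the numeric value of the packed word still depends on byte order, so it is only safe to hand to code that expects that machine's layout.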

-----
Vincent Manis                {seismo,uw-beaver}!ubc-vision!ubc-cs!manis
Dept. of Computer Science    manis@cs.ubc.cdn
Univ. of British Columbia    manis%ubc.csnet@csnet-relay.arpa  
Vancouver, B.C. V6T 1W5      manis@ubc.csnet
(604) 228-6770 or 228-3061

"BASIC is the Computer Science equivalent of 'Scientific Creationism'."

dillon@CORY.BERKELEY.EDU (Matt Dillon) (04/13/87)

>I haven't used unions much, but isn't it true that C may attempt to
>re-align the items within a structure (or union) to lie on proper
>boundaries (e.g. word boundaries, longword boundaries, etc) where
>necessary? This means that there may be padding bytes within the
>structure. Could this cause errors where the elements of the structures
>within the union don't lie "on top of one another" as one would
>expect?

	C will always align entries in a structure or union properly for
that machine, but you should never count on any particular layout.  If
alignment is absolutely required for your application, you can assume that
C will not re-align anything already on sizeof(char *) (usually longword)
boundaries.  chars are not usually re-aligned.  In any case, this is
portable over most machines.

	In terms of having two data items conflicting in a union, here is
an example (this is not necessarily how your compiler may do things):

	struct {
		char c;
		union {
			char d;
			char *x;
		} u;
	};

	&c = +0
	&u.d = +4
	&u.x = +4

	[c][unused][unused][unused][d][][][]
				   [x..4byt]

	You should *never* rely on unions aligning things in some known manner
if the data items inside the union are different sizes.
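
Rather than guessing, the offsets a particular compiler picked can be measured with offsetof from <stddef.h>. The sketch below gives the structure above hypothetical tag names (outer, inner) so that it can be inspected; the helper function names are also made up. The +4 in the diagram corresponds to a machine with 4-byte pointers and will differ elsewhere. One thing that does not vary is that every member of a union starts at offset 0 within the union; only the sizes and the padding around the union change.

```c
#include <stddef.h>

union inner {
    char d;
    char *x;
};

struct outer {
    char c;
    union inner u;
};

/* Offset of the union within the struct: 4 in the diagram above,
   commonly 8 where pointers are 8 bytes. */
size_t union_offset(void)
{
    return offsetof(struct outer, u);
}

/* Number of padding bytes the compiler inserted between c and u. */
size_t padding_after_c(void)
{
    return offsetof(struct outer, u) - sizeof(char);
}
```

Printing union_offset() and padding_after_c() on each target compiler is a cheap way to confirm a layout before any code depends on it.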

				-Matt