silvert@dalcs.UUCP (04/04/87)
Since I occasionally post source to the net, I thought it might be good to
try to reach a consensus on whether type coercion is desirable in portable
code, or whether, as the books say, it is a despicable and dangerous
practice which ranks with goto's.

I refer specifically to date and time functions, which sometimes use two
separate integer/CARDINAL variables (I refer to C/Modula-2 here), but
sometimes pack both into a long/LONGCARD variable. Conversion of one to the
other involves either arithmetic or bitswitching, and can lead to problems
such as the recently reported type-extension bug in settime and autodisk.

Type coercion in this case means putting both formulations in the same
location by using a union:

    union { long datetime;
            structure { int date;
                        int time;
                      } pair
          } trick;

(apologies if I haven't got the syntax quite right)
or a variant record in Modula-2

    TYPE trick = RECORD
        CASE : BOOLEAN OF
            TRUE : datetime : LONGCARD |
            FALSE: date, time : CARDINAL
        END
    END;

Then once you have obtained the value of datetime you have date and time
ready to work with, and vice versa, and the conversions are automatic.

I have used this in my own date-setting program and find it convenient,
but wonder if it creates problems with other compilers and languages.
Pascal also has the variant record feature, but this use of type
coercion is illegal -- whether the compilers on the ST permit it I don't
know. Clearly this usage assumes that the fields of a structure are stored
consecutively, which I think is always the case.

Anyway, comments on this usage would be appreciated. If I get a lot of
mail, I will post the conclusions to the net.
--
Bill Silvert, Modelling/Statistics Group
Bedford Institute of Oceanography, Dartmouth, NS, Canada
CDN or BITNET: silvert@cs.dal.cdn -- UUCP: ..!{seismo|utai}!dalcs!silvert
ARPA: silvert%dalcs.uucp@seismo.CSS.GOV -- CSNET: silvert%cs.dal.cdn@ubc.csnet
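For reference, the union described above can be written in compilable C roughly as follows. This is only a sketch, not the code from the poster's program: the fixed-width types from <stdint.h> are an assumption made so that the two 16-bit halves exactly cover the 32-bit word, and the helper names (date_is_high_half, date_of, time_of) are hypothetical. It also shows that which half ends up in `date` depends on the machine's byte order.

```c
#include <stdint.h>

/* Assumption: 16-bit halves overlaying one 32-bit word. */
union trick {
    uint32_t datetime;
    struct {
        uint16_t date;
        uint16_t time;
    } pair;
};

/* Returns 1 if pair.date overlays the high-order half of datetime
   (big-endian layout), 0 if it overlays the low-order half. */
int date_is_high_half(void)
{
    union trick t;
    t.datetime = 0x12345678UL;
    return t.pair.date == 0x1234;
}

/* Store the packed value, then read the halves back through the
   other member. The result is layout-dependent by design. */
uint16_t date_of(uint32_t packed)
{
    union trick t;
    t.datetime = packed;
    return t.pair.date;
}

uint16_t time_of(uint32_t packed)
{
    union trick t;
    t.datetime = packed;
    return t.pair.time;
}
```

Code that relies on this overlay could call date_is_high_half() once at start-up and refuse to run if the answer is not the one it was written for.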
wheels@mks.UUCP (04/06/87)
In article <2499@dalcs.UUCP>, silvert@dalcs.UUCP (Bill Silvert) writes:
> Type coercion in this case means putting both formulations in the same
> location by using a union:
>
>     union { long datetime;
>             structure { int date;
>                         int time;
>                       } pair
>           } trick;
>
> (apologies if I haven't got the syntax quite right)
>
> I have used this in my own datesetting program and find it convenient,
> but wonder if it creates problems with other compilers and languages.
> Clearly this usage assumes that structures are stored

I haven't used unions much, but isn't it true that C may re-align the
items within a structure (or union) to lie on proper boundaries (e.g.
word boundaries, longword boundaries, etc.) where necessary? This means
that there may be padding bytes within the structure. Could this cause
errors where the elements of the structures within the union don't lie
"on top of one another" as one would expect?
--
Gerry Wheeler                {seismo,decvax,ihnp4}!watmath!mks!wheels
Mortice Kern Systems Inc.
manis@ubc-cs.UUCP (04/07/87)
In article <2499@dalcs.UUCP> silvert@dalcs.UUCP (Bill Silvert) writes:
>Type coercion in this case means putting both formulations in the same
>location by using a union:
>
>    union { long datetime;
>            structure { int date;
>                        int time;
>                      } pair
>          } trick;
>
>(apologies if I haven't got the syntax quite right)
>or a variant record in Modula-2
>
>    TYPE trick = RECORD
>        CASE : BOOLEAN OF
>            TRUE : datetime : LONGCARD |
>            FALSE: date, time : CARDINAL
>        END
>    END;
>
>Then once you have obtained the value of datetime you have date and time
>ready to work with, and vice versa, and the conversions are automatic.

I won't bother to correct your C syntax, Bill; it's certainly close enough.
What you're referring to is not coercion (which just refers to the normal
conversions in a language) but "type-cheating". Is it moral? Well, in the
strictest sense, no. However, it is impossible to write code which is
simultaneously totally portable and yet manages to access system-dependent
information.

>Pascal also has the variant record feature, but this use of type
>coercion is illegal -- whether the compilers on the ST permit it I don't
>know.

I know of no Pascal compiler which actually checks for consistent use of
variants. In particular, Pascal allows untagged variants (just like C
unions), in which of course the programmer can select any variant without
regard to which variant is presently valid. Of course, the ISO standard
says this is illegal, but all they mean is that the onus is on the
programmer (or possibly the compiler) to determine what is actually meant
(i.e., how the record is laid out in memory).

>Clearly this usage assumes that structures are stored
>consecutively, which I think is always the case.

You (and anyone who compiles your code) must know how the compiler lays out
the information. In that regard, C is safer than Modula-2, because the
draft ANSI standard says that fields must be laid out in order of the
declarations, while Modula-2 compilers are not so constrained. You also
have to beware of alignment bytes which some compilers insert.

As a suggestion, here are some guidelines:

a) Don't type-cheat more than you have to.
b) In C, Pascal, and Modula-2, do type-cheating by means of either an
   explicit type declaration (as above), or by means of a cast (Pascal
   doesn't have casts, but Modula-2 and C do).
c) Localise type-cheating code to a specific module.
d) Put in plentiful comments showing the intent of the code.

-----
Vincent Manis                {seismo,uw-beaver}!ubc-vision!ubc-cs!manis
Dept. of Computer Science    manis@cs.ubc.cdn
Univ. of British Columbia    manis%ubc.csnet@csnet-relay.arpa
Vancouver, B.C. V6T 1W5      manis@ubc.csnet
(604) 228-6770 or 228-3061

"BASIC is the Computer Science equivalent of 'Scientific Creationism'."
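One way to follow guidelines (c) and (d) is to keep all knowledge of the packed layout inside a pair of small functions, and to do the conversion arithmetically so that the result does not depend on byte order or padding. The following is a sketch only: the function names are hypothetical, and placing the date in the high-order 16 bits is an assumption made for illustration, not something stated in the thread.

```c
#include <stdint.h>

/* Assumed layout (illustration only):
   bits 31..16 = date word, bits 15..0 = time word. */

/* Combine the two 16-bit words into one 32-bit value. The operands are
   widened to an unsigned 32-bit type before shifting, so no sign
   extension can leak into the other half. */
uint32_t pack_datetime(uint16_t date, uint16_t time)
{
    return ((uint32_t)date << 16) | (uint32_t)time;
}

/* Extract the high-order word. */
uint16_t unpack_date(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

/* Extract the low-order word. */
uint16_t unpack_time(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}
```

The rest of a program would then call only these functions, so a change in the packed format touches one module.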
dillon@CORY.BERKELEY.EDU (Matt Dillon) (04/13/87)
>I haven't used unions much, but isn't it true that C may attempt to
>re-align the items within a structure (or union) to lie on proper
>boundaries (e.g. word boundaries, longword boundaries, etc) where
>necessary? This means that there may be padding bytes within the
>structure. Could this cause errors where the elements of the structures
>within the union don't lie "on top of one another" as one would
>expect?

C will always align entries in a structure or union properly for that
machine, but you should never count on any particular layout. If alignment
is absolutely required for your application, you can assume that C will
not re-align anything already on sizeof(char *) (usually longword)
boundaries. chars are not usually re-aligned. In any case, this is
portable over most machines.

In terms of having two data items conflicting in a union, here is an
example (this is not necessarily how your compiler may do things):

    struct {
        char c;
        union {
            char d;
            char *x;
        } u;
    };

    &c   = +0
    &u.d = +4
    &u.x = +4

    [c][unused][unused][unused][d][ ][ ][ ]
                               [x..4 bytes]

You should *never* rely on unions aligning things in some known manner if
the data items inside the union are different sizes.

                                                -Matt
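The layout in the example above can be checked on a given compiler with offsetof instead of being assumed. This is a sketch with hypothetical type and function names; the union is given a tag only so that it can be inspected separately. C guarantees that every member of a union starts at offset 0 within the union, so what varies between compilers is the padding placed before the union and how values of different sizes overlap.

```c
#include <stddef.h>

/* Tagged version of the union from the example, for inspection. */
union inner {
    char  d;
    char *x;
};

struct outer {
    char        c;
    union inner u;
};

/* Offset of the union within the struct; +4 in the example above,
   but the value depends on the compiler and the machine. */
size_t union_offset(void)
{
    return offsetof(struct outer, u);
}

/* Number of padding bytes inserted between c and the union. */
size_t padding_after_c(void)
{
    return offsetof(struct outer, u) - sizeof(char);
}
```

Printing union_offset() and padding_after_c() on each target machine shows directly how much the layout differs from one compiler to the next.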