[net.lang.c] derived types

cottrell@nbs-vms.ARPA (01/17/85)

/*
one of the constraints not mentioned so far (or i missed it) is that
sizeof(int) must be the same as sizeof(int *). the practice of defining
derived types is an attempt to avoid machine dependency. however, why
not say what we REALLY (yes, my tty has caps) mean: dispense with
char, short, int, long, and use byte, word, long, and quad. my model
is the vax, with sizes of 8, 16, 32, & 64 bits respectively. while
this is somewhat ethnocentric, it pays homage to the fact that unix
was developed primarily on the pdp-11 and vax. weird machines such
as u1108 (b=9, w=18, l=36, q=72) and cdc6400 (b=6?, w=15, l=30, q=60)
would have to adapt as best they can (they kinda have to now anyway).
this unfortunately blows the correspondence between int & int *, but
i suspect that the standard will have something more to say about
ptr's than the opening sentence of this paragraph. (b=10 for cdc6400?)
*/

Doug Gwyn (VLD/VMB) <gwyn@BRL-VLD.ARPA> (01/18/85)

??  I don't know where you got the idea, but sizeof(int) != sizeof(int *)
on many C compilers in wide use today.  Maybe you're thinking of B.

robert@gitpyr.UUCP (Robert Viduya) (01/20/85)

> weird machines such as u1108 (b=9, w=18, l=36, q=72) and cdc6400 (b=6?,
> w=15, l=30, q=60) would have to adapt as best they can (they kinda have
> to now anyway).
>

Actually, the CDC 6400 series, as well as the 170 series, has 18-bit addresses,
which means pointers have to be at least that large.  Also, addressing is done
only on 60-bit words.  All arithmetic operations are done with 60 bits per
operand, minimum (except for addressing arithmetic).  The native character
set is a 6-bit one, which means 10 characters per word.  Accessing one character
means loading a word and doing shifts and ands.  In adapting C to that
architecture, it would be more time-efficient to store one character per word;
however, storage would be drastically wasted (when you've only got an 18-bit
address space, storage can be at a premium; the architecture doesn't support
virtual memory).  NOS, one of the operating systems available for the 170
series, supports an 8-in-12-bit character set where each character is
represented as the lower 8 bits of a 12-bit cell (in true ASCII).

If I were to adapt C to the 170 series, I would set the following types:

	int			60-bits
	long int		60-bits
	short int		30-bits (2 per word)
	pointer			30-bits (2 per word, upper 12 bits used for
					 indexing into words)
	char			12-bits (5 per word, strings aligned on
					 word boundaries)
	float			60-bits
	double			60-bits

Obviously, there'll be a lot of wasted time when scanning a string, but
proper register management can minimize that (there are enough registers
to hold everything).  Short ints should be used to save storage and
nothing else.

				robert
-- 
Robert Viduya
    Office of Computing Services
    Georgia Institute of Technology, Atlanta GA 30332
    Phone:  (404) 894-4669

...!{akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!robert
...!{rlgvax,sb1,uf-cgrl,unmvax,ut-sally}!gatech!gitpyr!robert

MLY.G.SHADES%MIT-OZ@MIT-MC.ARPA (01/21/85)

what we really want is to define c, as knuth did with mix: say that
char must hold at least 2^8-1, short at least 2^16-1, and long at
least 2^32-1, with int bound to either short or long.  this eliminates
the implementation problems on all machines, because an implementation
just selects the smallest storage unit that will hold the specified type.

                          shades@mit-oz.arpa

ndiamond@watdaisy.UUCP (Norman Diamond) (01/21/85)

> /*
> one of the constraints not mentioned so far (or i missed it) is that
> sizeof(int) must be the same as sizeof(int *). the practice of defining
> derived types is an attempt to avoid machine dependency. however, why
> not say what we REALLY (yes, my tty has caps) mean: dispense with
> char, short, int, long, and use byte, word, long, and quad. my model
> is the vax, with sizes of 8, 16, 32, & 64 bits respectively. while
> this is somewhat ethnocentric, it pays homage to the fact that unix
> was developed on these two machines primarily. weird machines such
> as u1108 (b=9, w=18, l=36, q=72) and cdc6400 (b=6?, w=15, l=30, q=60)
> would have to adapt as best they can (they kinda have to now anyway).
> this unfortunately blows the correspondence between int & int *, but
> i suspect that the standard will have something more to say about
> ptr's than the opening sentence of this paragraph. (b=10 for cdc6400?)
> */

A lot of micros would have to adapt too.  (Do they kinda have to now
anyway?)

It has been a long time since anyone suggested that a language could be
standardized in such a manner that for some machines, any implementation
is automatically non-standard.

Or there's another solution:  every "primitive" entity can be stored in
the same size unit, the unit being large enough to hold any kind of
entity (double?  in some languages, double complex?).  The excess space
would of course be "intentionally left unused".

-- Norman Diamond

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdaisy!ndiamond
CSNET: ndiamond%watdaisy@waterloo.csnet
ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

"Opinions are those of the keyboard, and do not reflect on me or higher-ups."

ndiamond@watdaisy.UUCP (Norman Diamond) (01/22/85)

> If I were to adapt C to the 170 series, I would set the following types:
> 
>       ...
> 	float			60-bits
> 	double			60-bits
> -- 
> Robert Viduya

Don't you dare.  Double doesn't have to be exactly twice a float, but there
has to be enough extra precision to assist numerical calculations in certain
kinds of error determinations.

-- Norman Diamond


ndiamond@watdaisy.UUCP (Norman Diamond) (01/23/85)

> what we really want is to define c, as knuth did with mix, to say that
> char has a minimum holding value of 2^8-1 and short 2^16-1 and long
> 2^32-1 with int being bound to either short or long.  this eliminates
> the implementation problems on all machines because all you do is
> select the smallest element that will hold the specified type.
> 
>                           shades@mit-oz.arpa

Except that we'd have to say something like 2^6-1 and 2^12-1.
This is a case where even I would not try to make every program as
portable as it could be.  If there were some reason to assume that
chars could hold 2^8-1, I would assume it (and include a warning).
But if the standard specifies this, it would leave some machines
with an option of (i) non-conforming implementations, (ii) no
implementations, or (iii) wasted space and inefficient code just
to meet this requirement.  And if the standard specifies 2^6-1,
then many people (including me) would be discouraged from writing
standard-conforming programs.

I try to make Pascal programs portable too (standard-conforming
and with a certain amount of care, with exceptions carefully
encapsulated).  But if I need "set of char", I do it and warn
that it might not be portable for that reason.

-- Norman Diamond


robert@gitpyr.UUCP (Robert Viduya) (01/24/85)

> > If I were to adapt C to the 170 series, I would set the following types:
> > 
> >       ...
> > 	float			60-bits
> > 	double			60-bits
> > -- 
> > Robert Viduya
> 
> Don't you dare.  Double doesn't have to be exactly twice a float, but there
> has to be enough to assist numerical calculations in certain kinds of error
> determinations.
> 

The CDC 170 series doesn't support reals larger than 60 bits in the hardware.
It can be done, but only through software.  C is defined to calculate all
real numbers as doubles.  This means that there'll be a lot of time wasted
doing something the machine can't do with its native instruction set.

				robert

henry@utzoo.UUCP (Henry Spencer) (01/30/85)

> > If I were to adapt C to the 170 series, I would set the following types:
> > 
> > 	float			60-bits
> > 	double			60-bits
> 
> Don't you dare.  Double doesn't have to be exactly twice a float, but there
> has to be enough to assist numerical calculations in certain kinds of error
> determinations.

Sorry, even tired old K&R says (section 4):

	Single-precision floating point (float) and double-precision
	floating-point (double) may be synonymous in some implementations.

This really is inevitable on machines that don't gracefully support a
short floating-point type.  Making float == double really is the right
thing to do when the shortest floating-point type is adequate for
normal use, i.e. 64 bits or thereabouts.  Making double a longer type
generally means a severe speed penalty for all arithmetic, unless the
compiler is new enough to take advantage of the ANSI draft's permissive
language on this point.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

brooks@lll-crg.ARPA (Eugene D. Brooks III) (01/31/85)

> The CDC 170 series doesn't support reals larger than 60-bits in the hardware.
> It can be done, but only through software.  C is defined to calculate all
> real numbers as doubles.  This means that there'll be a lot of wasted time
> spent doing something the machine can't do with it's native instruction set.

This stupidity, due to how the hardware on the FP11 worked (do you guys out
there even remember the PDP-11?), ended up in the language definition.  No
sensible compiler writer would kill floating-point performance like this
these days.  Many C compilers either don't conform to this or have a flag to
cause non-conformity.  Unfortunately my Vax compiler at least still passes
floats on the stack as doubles for the sake of not breaking printf.  I could
even do without that.

It's time we removed this stupidity from the language definition proper and
turned the compatibility problem over to lint!

jsdy@SEISMO.ARPA (02/01/85)

> one of the constraints not mentioned so far (or i missed it) is that
> sizeof(int) must be the same as sizeof(int *). ...

No.  This is something a lot of people believe.  (Especially those who
believe that on the 0th day God created the VAX ... and on days 1-7,
'cuz He liked it so much.)  It is, as you say, an ethnocentrism, and a
hindrance to portable code.  It is  n o t  true.	;-S

Joe Yao		hadron!jsdy@seismo.{ARPA,UUCP}

Doug Gwyn (VLD/VMB) <gwyn@Brl-Vld.ARPA> (02/01/85)

Could we please quit griping about C's floating arithmetic being
done in double?  This misfeature is being fixed in ANSI C.

guy@rlgvax.UUCP (Guy Harris) (02/02/85)

> > The CDC 170 series doesn't support reals larger than 60-bits in the hardware.
> > It can be done, but only through software.  C is defined to calculate all
> > real numbers as doubles.  This means that there'll be a lot of wasted time
> > spent doing something the machine can't do with it's native instruction set.
> 
> This stupidity, due to how the hardware on the FP11 worked (do you guys out
> there even remember the PDP11?), ended up in the language definition.  No
> sensible compiler writer would kill floating point performance like these
> days.

It didn't even *have* to be done for the PDP-11; DEC's Fortran-IV Plus
compiler quite happily generated all the "setf"/"setd" instructions needed
to do single-precision arithmetic.  However, it may not have been worthwhile
to go through the hair involved in doing that in C.

> Its time we removed this stupidity from the language definition proper and
> turned the compatibility problem over to lint!

The current ANSI Standard C draft, I believe, leaves it up to the compiler
writer to decide whether to do floating point arithmetic in single or double
precision.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

cottrell@nbs-vms.ARPA (02/04/85)

/*
uncle! i give! i accept the fact that sizeof(int) may not be sizeof(int *).
i DO believe that sizeof(foo *) should be sizeof(bar *). otherwise it's
just too confusing. more irrational viewpoints later.
*/

guy@rlgvax.UUCP (Guy Harris) (02/05/85)

> i DO believe that sizeof(foo *) should be sizeof(bar *). otherwise it's
> just too confusing. more irrational viewpoints later.

It's too confusing only if you're easily confused.  Cast your d*mn pointers
properly (like "lint" tells you to do) and lots of problems go away.
If you absolutely *need* a pointer that can point to any one of ten
things, and a union of ten pointer types won't do, use "char *" (or "void *"
if/when it appears in the ANSI standard).

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

ndiamond@watdaisy.UUCP (Norman Diamond) (02/08/85)

> /*
> uncle! i give! i accept the fact that sizeof(int) may not be sizeof(int *).
> i DO believe that sizeof(foo *) should be sizeof(bar *). otherwise it's
> just too confusing. more irrational viewpoints later.
> */

I agree that it is confusing when sizeof(foo *) != sizeof(bar *).
Fortunately, machines like this are invented a little bit less frequently
than those that have sizeof(int) != sizeof(int *).
However, the question remains:

When a machine has such confusing (and obnoxious and <ROT13>) characteristics,
we have a choice of:
(1)  Reflecting it in C,
(2)  Wasting memory so that smaller pointers can be allocated the same amount
       of memory as larger pointers, or
(3)  Not allowing C compilers to exist for that machine.
-- 
   Norman Diamond