[comp.lang.c] C machine

rwa@auvax.UUCP (Ross Alexander) (12/18/87)

(I have moved this from comp.arch for obvious reasons)

In article <8226@steinmetz.steinmetz.UUCP>, davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) writes:
> ... and in keeping with being typeless but
> needing floating point, [I] used floating operators instead. As I recall the
> assignment operator was changed to ":=" ala Pascal, equality was just
> "=", and inequality was "<>" like BASIC. This made the code somewhat
> easier to read if you didn't know the language.

I remember Tom Duff hacking on our GCOS/TSS implementation of B, long long
ago, with exactly the same intent.  He created a whole whack of <op>$ things
(i.e., +$, -$, *$, /$) which assumed the object was a float.  Not a nice
language to actually use...

BTW, did you know that labels were variables preinitialized to point
to wherever in the code they were declared?  This let you do the old
COBOL hack of 'ALTER PROCEDURE <FOO> TO PROCEED TO <BAR>' by saying
'foo = bar;' where somewhere or other 'foo:  <code>;' and 'bar:
<code>;' were lurking in that procedure.  Then 'goto foo;' transferred
you to bar:  (ugh!).  Dave Conroy was famous for declaring labels to
be used as static temporaries rather than as branch targets.  

In fact, even functions were really variables that just happened to
point to code that would perform the procedure (that is, the code
itself was anonymous, access to a proc was strictly by indirecting
through a named object that pointed at the code for the proc).  I
exploited this to do various grotty things, such as tell lies to
library routines at runtime (an early version of _very_ dynamic
bindings in a statically compiled language ;-), personally; I'm older
now & have learned better.  

--
Ross Alexander @ Athabasca University
alberta!auvax!rwa

chris@mimsy.UUCP (Chris Torek) (12/21/87)

[I am moving this to comp.lang.c]
>In article <164@sdeggo.UUCP> dave@sdeggo.UUCP (David L. Smith) writes:
>>Besides, there's nothing that says you can't write a string package which
>>has the string preceded by the length.

It is not hard, but it is annoying:

In article <3078@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
>... the problem is that the C compiler only supports null-terminated
>string constants (of the form "I am a string").  Either you have to
>learn to live without string constants, or put up with having to
>initialize everything with 'cvt_null2count ("string constant")' at run time.

The following works:

	typedef struct string {
		int	s_len;
		char	*s_str;
	} string;
	#define	STRING(x) { sizeof(x) - 1, x }

	string foo = STRING("this is a counted string");

Unfortunately, some compilers (including 4BSD PCC) then generate
the string twice in data space, and the `xstr' program only makes
it worse.  In addition, since automatic aggregate initialisers are
not permitted, and there are no aggregate constants, automatic data
must be initialised with something like this:

	f()
	{
		static string s_init = STRING("initial value for s");
		string s = s_init;
		...

(I believe the dpANS allows automatic aggregate initialisers.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bhj@bhjat.UUCP (Burt Janz) (12/25/87)

I realize that this may have been discussed before, but I have a minor
problem... no, not a problem... more like a question.

Ok.  K&R states that the PDP-11, a 16-bit machine, defines an int as
16 bits, a short as 16 bits, and a long as 32 bits.  The VAX defines an
int as 32 bits, a short as 16 bits, and a long as 32 bits.  Altos defines
their 386 UNIX compiler's sizes the same as the VAX.  The 3B1 defines their
68010 compiler's sizes the same as the 386 (although the byte ordering is
different - doesn't matter for this application).

In the "white book" on page 34, there is a brief discussion of "... 'natural'
size for a particular machine ...".  I assume that means the internal register
word size, not the bus transfer word size.

I hold that a short is defined as ALWAYS being 16 bits, and a long as
ALWAYS being 32 bits.  I don't know if I'm right in this regard, but being
stubborn, I always press the point of previous compiler/machine definitions.

So, on a 64-bit processor, what's an int?  For that matter, on machines larger
than 32 bits, what would short and long be?  I'm not particularly interested
in float and double, as those would be functions of the math routines being
used, or the math chips available for the machine.

I ask the opinions, and/or expert knowledge, of those on the net.

Burt Janz
..decvax!bhjat!bhj (preferred path)
..decvax!bhjatt!bhj
(sorry, no direct phone #...)

chris@mimsy.UUCP (Chris Torek) (12/27/87)

In article <163@bhjat.UUCP> bhj@bhjat.UUCP (Burt Janz) writes:
>I hold that a short is defined as ALWAYS being 16 bits, and a long as
>ALWAYS being 32 bits.  I don't know if I'm right in this regard,

Nope.  The dpANS says only that short >= 16 bits, int >= 16 bits, and
long >= 32 bits.

>but being stubborn, I always press the point of previous compiler/
>machine definitions.

Personally, I prefer the point of usefulness.

>So, on a 64-bit processor, what's an int?

I would say that, if the machine has a 64 bit address space and 64
bit arithmetic, but the `natural' arithmetic size is 32 bits, it
is 32 bits; otherwise it is 64.  I would make longs 64 (or perhaps
even 128) bits, and shorts quite possibly still 16 bits.  It would
be nice to have 8, 16, 32, and 64 bit values, and it seems natural
to make those `char', `short', `int', and `long' respectively.  This
might be a problem for programs that assume either short==int or
long==int, though.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

msb@sq.uucp (Mark Brader) (12/29/87)

On the topic of short/int/long, Chris Torek (chris@mimsy.UUCP) writes:
> Nope.  The dpANS says only that short >= 16 bits, int >= 16 bits, and
> long >= 32 bits.

This is misleading in the context; the dpANS *also* says that short <= int
and int <= long.  So the method adopted by a 64-bit machine in the past,
where long = 32 bits "because that's what everybody assumes", and
int = 64 bits "because that's the natural size", would be non-conforming.

On a 64-bit machine, I would think that int = 32 bits, long = 64 bits,
would be natural.  However, if I found that I was dealing with a lot
of code that thought long = 32 bits or if needlessly "long" arrays were
chewing up my memory, I would consider having int = 32 bits, long = 32 bits,
"long long" = 64 bits.  "long long" is allowed by the dpANS as a "common
extension".  It renders the program non-portable but any program that
needs more than 32 bits in an integer is non-portable in any case.

Mark Brader		"VAX 3 in 1 carpet care -- now 129.95 pounds"
utzoo!sq!msb, msb@sq.com

cruff@scdpyr.UUCP (Craig Ruff) (12/30/87)

In article <1987Dec28.163403.24137@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>
>On the topic of short/int/long, Chris Torek (chris@mimsy.UUCP) writes:
>> Nope.  The dpANS says only that short >= 16 bits, int >= 16 bits, and
>> long >= 32 bits.
>
>This is misleading in the context; the dpANS *also* says that short <= int
>and int <= long.  So the method adopted by a 64-bit machine in the past,
>where long = 32 bits "because that's what everybody assumes", and
>int = 64 bits "because that's the natural size", would be non-conforming.

Here is the scoop on some Cray machines.  All numbers are in bits, in the
format precision(memory), where memory = amount of memory to store item.

		Cray-1, Cray X-MP	Cray-2
		-----------------	------------
	char	8(8)			8(8)
	int	64(64)			32(64)
	short	24(64)			32(64)
	long	64(64)			64(64)
	float	64(64)			64(64)
	double	64(64)			64(64)
	char *	64(64)			64(64)
	other *	24(64)			24(64)

Note that these are the not-all-pointers-are-alike machines, and pointers
cannot always be converted to ints and back again.  Character pointers are
stored as a 24 bit word address and a 3 bit offset (in the most significant
bits).
-- 
Craig Ruff      NCAR                         INTERNET: cruff@scdpyr.UCAR.EDU
(303) 497-1211  P.O. Box 3000                   CSNET: cruff@ncar.CSNET
		Boulder, CO  80307               UUCP: cruff@scdpyr.UUCP

rgr@m10ux.UUCP (Duke Robillard) (12/31/87)

In article <163@bhjat.UUCP> bhj@bhjat.UUCP writes:
>I hold that a short is defined as ALWAYS being 16 bits, and a long as
>ALWAYS being 32 bits.....
>So, on a 64-bit processor, what's an int?  For that matter, on machines larger
>than 32 bits, what would short and long be?  


According to our Cray programmers, on that 64-bit processor, a byte, a short,
an int, and a long are all 64 bits.  kinda weird.  It must make something
really fast....


-- 
  |      Duke Robillard           {ihnp4!}m10ux!rgr                    |
  |      AT&T Bell Labs           m10ux!rgr@ihnp4.UUCP                 |
  |      Murray Hill, NJ          This page accidentally left blank    |
  +--------------------------------------------------------------------+

andrew@teletron.UUCP (Andrew Scott) (01/01/88)

I've got a related question to this discussion thread; I've crossposted to
comp.sys.m68k also.

In article <9961@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> I would say that, if the machine has a 64 bit address space and 64
> bit arithmetic, but the `natural' arithmetic size is 32 bits, it
> is 32 bits; otherwise it is 64.  I would make longs 64 (or perhaps
> even 128) bits, and shorts quite possibly still 16 bits.  It would
> be nice to have 8, 16, 32, and 64 bit values, and it seems natural
> to make those `char', `short', `int', and `long' respectively.  This
> might be a problem for programs that assume either short==int or
> long==int, though.

What should be the compiler writer's criteria for selecting the "natural"
size for the above C types?  Should it be bus size, internal register size,
or something else?  Our 68000 compiler has 16 bit shorts, 32 bit longs (which
make sense) and 32 bit ints (which doesn't always make sense).

A lot of code I've come across uses scratch variables (array indices etc.) of
type int.  Of course, 32 bit arithmetic must be used.  However, the 68000 has
16 bit divide and multiply instructions, which are *much* faster than the 
subroutine calls to the 32 bit arithmetic routines.  The case could be made
that a 16 bit quantity is the "natural" size for arithmetic operations for
the 68000.

Why would a compiler vendor for the 68000 choose a 32 bit size for an int?

	Andrew			(..alberta!teletron!andrew)

hunt@spar.SPAR.SLB.COM (Neil Hunt) (01/02/88)

Summary: Justification for 32 bit ints on 68k machines.

In article <166@teletron.UUCP> andrew@teletron.UUCP (Andrew Scott) writes:
>
>[...] Our 68000 compiler has 16 bit shorts, 32 bit longs (which
>make sense) and 32 bit ints (which doesn't always make sense).
>
>A lot of code I've come across uses scratch variables (array indices etc.) of
>type int.  Of course, 32 bit arithmetic must be used.

Since the 68000 has 32 bit registers, there is frequently a penalty
on operations in 16 bits - what does the compiler do about the other
16 bits in the registers?  At least in the Sun compilers, it is very
hard to persuade the compiler not to put an extend `extl dn' instruction after
every load of a short variable into a register and a clear `moveq #0 dn'
instruction before each load of an unsigned short value into a register.

Another (perhaps less defensible) reason is that a lot of code
tends to be rather cavalier about exchanging pointers and ints,
(particularly in function return values, for example),
and a 16 bit int would break all of this code.

>However, the 68000 has
>16 bit divide and multiply instructions, which are *much* faster than the 
>subroutine calls to the 32 bit arithmetic routines.  The case could be made
>that a 16 bit quantity is the "natural" size for arithmetic operations for
>the 68000.

Indeed the 68000/8/10/12 have a 16x16->32 bit multiply instruction,
and a function is required for a long multiply. Note however that
in the case that the operands would have fitted into 16 bits, this
fact is quickly discovered and the short multiply is used instead:

		jsr	lmult	; 20

lmult:					; d0 and d1 are the operands.
		movl	d2,sp@- ; 14
		movl	d0,d2	;  4
		orl	d1,d2	;  6	; OR all the bits together.
		clrw	d2	;  4	; mask bits 0..15, leaving 16..31.
		tstl	d2	;  4
		bnes	...	;  6	; if 16..31 are not zero, branch to ...
		mulu	d1,d0	; 40	; do the simple multiply
		movl	sp@+,d2	; 12
		rts		; 16

				 126 cycles

This is using 68010 timings, with some assumptions.

We see that, even counting the entire function call overhead, there is only
a factor of 3.1 between the function call and the use of the hardware
instruction directly.  Things are perhaps not so bad!

The Sun compiler is also smart enough to recognise when a multiply
by a constant is possible in a 16 bit instruction, and uses it rather
than the function call in these cases.

Finally, the 68020 has three sizes of multiply instructions,
16x16->32, 32x32->32, and 32x32->64; on this machine there is little
penalty in having 32 bit ints, and the other advantages still apply.
A compiler writer aware that any 16/32 bit decision for ints would
apply across all 68k machines would probably not decide upon 16 bits
just because some of the machines are slightly slower on one instruction,
especially when all the machines would have to pay the penalty of
maintaining the high bits in the registers if 16 bits were the decision.

Neil/.

PS: Try using:
	a = (int)((short)x * (short)y);
if you really need that factor of 3 back in the multiply instruction --
On a Sun 2 this generates a `muls' instruction!

smvorkoetter@watmum.waterloo.edu (Stefan M. Vorkoetter) (01/03/88)

In article <461@m10ux.UUCP> rgr@m10ux.UUCP (Duke Robillard) writes:
>In article <163@bhjat.UUCP> bhj@bhjat.UUCP writes:
>>I hold that a short is defined as ALWAYS being 16 bits, and a long as
>>ALWAYS being 32 bits.....
>>So, on a 64-bit processor, what's an int?  For that matter, on machines larger
>>than 32 bits, what would short and long be?  
>According to our Cray programmers, on that 64-bit processor, a byte, a short,
>an int, and a long are all 64 bits.  kinda weird.  It must make something
>really fast....

I have heard that at least one Cray C compiler (is there more than one?)
converts all chars, ints, etc. to floats before performing math, and then
converts them back.  I hear that although the Cray is good at floats (esp.
many in a row), the speed is terribly slow (about 5 x a VAX 785).
These are all just things I heard, not necessarily based in fact.

Stefan Vorkoetter

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)

In article <163@bhjat.UUCP> bhj@bhjat.UUCP (Burt Janz) writes:
>In the "white book" on page 34, there is a brief discussion of "... 'natural'
>size for a particular machine ...".  I assume that means the internal register
>word size, not the bus transfer word size.

Not necessarily, although usually it's the register size (on a general
register architecture) or the memory word size (on most other common machines).
The implementor is free to make some really weird choice, so long as the
language constraints are met.

>I hold that a short is defined as ALWAYS being 16 bits, and a long as
>ALWAYS being 32 bits.

Since you have K&R, presumably you can check the table on p.34 to see
that it contradicts you.

>So, on a 64-bit processor, what's an int?

On our Crays, it's 64 bits, naturally.  (So's a short!)

I have heard of at least one 64-bit machine where int was 32 bits, or
at least that size was seriously considered.  I don't think that's in
keeping with the intent of "int", but it would be legal.

The size-related things you may portably rely on (except on perhaps a few
non-ANSI conforming implementations, typically "toy compilers") are:
	1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
	sizeof(float) <= sizeof(double) <= sizeof(long double)
	sizeof(void *) == sizeof(char *) >= sizeof(any_other_object_type *)
	BITS(char) >= 8		where BITS is assumed to be a macro that returns
				the number of bits in an object of that type
	BITS(short) >= 16
	BITS(int) >= 16
	BITS(long) >= 32
	BITS(float) >= 24
	BITS(double) >= 38
	BITS(long double) >= 38
	BITS(void *) >= 15
	BITS(char *) >= 15
	BITS(any_other_object_type *) >= 14
	"signed" and "unsigned" objects have the same size

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)

In article <1987Dec28.163403.24137@sq.uucp> msb@sq.UUCP (Mark Brader) writes:
>"long long" is allowed by the dpANS as a "common extension".

I don't think it is "allowed" in an ANSI-conforming implementation,
because "long long" violates a Constraint in section 3.5.2 and thus
requires a diagnostic.  However, it's hard to imagine this extension
breaking any strictly conforming program, so it would be a "safe"
extension to make.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)

In article <461@m10ux.UUCP> rgr@m10ux.UUCP (Duke Robillard) writes:
>According to our Cray programmers, on that 64-bit processor, a byte, a short,
>an int, and a long are all 64 bits.

I don't know what a "byte" is, but a Cray char is 8 bits.
Thus sizeof(short)==8 on the Cray (X-MP or Cray-2).

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/06/88)

In article <166@teletron.UUCP> andrew@teletron.UUCP (Andrew Scott) writes:
>Why would a compiler vendor for the 68000 choose a 32 bit size for an int?

To make it easier to port sloppily-written VAX code.

msb@sq.uucp (Mark Brader) (01/08/88)

I wrote:
> >"long long" is allowed by the dpANS as a "common extension".

Doug Gwyn (VLD/VMB) <gwyn> (gwyn@brl.arpa) replied:
> I don't think it is "allowed" in an ANSI-conforming implementation,
> because "long long" violates a Constraint ...

Agreed.  I should have said that it was *mentioned* as a "common extension"
(specifically, in section A.6.5.6); this might be interpreted as a hint that
if you are going to have such a type, that is a good name to call it.
Actually A.6.5.6 renders it as "long long int".   Anyway, the whole matter
of "common extensions" is not part of the Draft proper.

chip@ateng.UUCP (Chip Salzenberg) (01/09/88)

Andrew Scott asks:
>Why would a compiler vendor for the 68000 choose a 32 bit size for an int?

To which Doug Gwyn replies:
>To make it easier to port sloppily-written VAX code.

Yes -- but not all compiler writers have buckled under the great mass of
"AllTheWorldsAVax" programs.  The most popular C compilers available for
the Commodore Amiga (Manx and Lattice) have different sizes for "int"!
One is 16 bits (int==short) and the other is 32 bits (int==long).

I prefer 16 bits myself, but that's a religious issue, and far be it from
me to start an argument.  :-)
-- 
Chip Salzenberg                 UUCP: "{codas,uunet}!ateng!chip"
A T Engineering                 My employer's opinions are a trade secret.
    "Anything that works is better than anything that doesn't."  -- me

atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG]) (01/11/88)

In article <461@auvax.UUCP> rwa@auvax.UUCP (Ross Alexander) writes:
>I remember Tom Duff hacking on our GCOS/TSS implementation of B, long long
>ago, with exactly the same intent.  He created a whole whack of <op>$ things
>(i.e., +$, -$, *$, /$) which assumed the object was a float.  Not a nice
>language to actually use...
>
Actually, the operators are #+ #* #/ etc., and as I recall it was
Renaldo Braga that did it while he was writing the compiler.
The changes to the language were not a "hack".  They made floating
point numbers fit into the language in a manner consistent with
the rest of the language.

The point to realize is that B is a rather elegant small language
with a uniform view of objects and how they are handled.  The world
of B consists of a linear array of cells (words if you like) on which
operations such as addition, assignment, indirection, function call,
and transfer (i.e., GOTO) are performed.  This gives the programmer
considerable flexibility to express his algorithm, but like everything
else is open to abuse.  The language maps nicely onto a class of machine
architecture, and is well suited to the class of programming tasks
that is usually considered "systems programming" (utilities, compilers,
editors, etc.).  The fact that it is typeless does give problems
when you try to move out of this area.  That is, it doesn't carry
well to byte-addressed machines, and is not suited to numerical
applications.  (The floating point is good enough to calculate some
things, like the percentage of a resource that has been used, etc.)

atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG]) (01/11/88)

In article <163@bhjat.UUCP> bhj@bhjat.UUCP (Burt Janz) writes:
>
>In the "white book" on page 34, there is a brief discussion of "... 'natural'
>size for a particular machine ...".  I assume that means the internal register
>word size, not the bus transfer word size.
>
>I hold that a short is defined as ALWAYS being 16 bits, and a long as
>ALWAYS being 32 bits.  I don't know if I'm right in this regard, but being
>stubborn, I always press the point of previous compiler/machine definitions.
>

You really don't have to look past the "white book" to realize this is wrong.
The Honeywell 6000 (aka DPS-8, DPS-88, DPS-90, DPS8000) uses 36 bits
for short, int, and long.  This choice was implemented even before VAXes existed.
The machine just doesn't handle integers of any smaller size with any
ease at all.  

johnf@apollo.uucp (John Francis) (01/15/88)

>Why would a compiler vendor for the 68000 choose a 32 bit size for an int?

Consider:

    main()
        {
        char    c[100000], *p1, *p2;
        int     dp;

        p1 = &c[0];
        p2 = &c[99999];

        dp = p2 - p1;
        }

The result of subtracting two pointers is *defined* (K&R) to yield
a result of type int. Not long int - int!  Any implementation that
can not represent this in an int is not an implementation of C.

Mind you, any decent 68000 compiler should provide a 16-bit short int,
and the code generator should be able to handle a = b + c (all shorts)
without converting b & c to ints, adding them, and then converting the
result to short.  If it is really good it should be able to do the same
for a = b + 2.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/15/88)

In article <39aca826.7f32@apollo.uucp> johnf@apollo.uucp (John Francis) writes:
>The result of subtracting two pointers is *defined* (K&R) to yield
>a result of type int.

That's another of the things that ANSI C intends to fix.

alex@umbc3.UMD.EDU (Alex S. Crain) (01/16/88)

In article <7092@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <39aca826.7f32@apollo.uucp> johnf@apollo.uucp (John Francis) writes:
>>The result of subtracting two pointers is *defined* (K&R) to yield
>>a result of type int.
>
>That's another of the things that ANSI C intends to fix.


Fix? How pray tell would you fix it? make it machine dependent? or just
arbitrary? 

	Pick one:

	Pntr - Pntr -> int
	Pntr - Pntr -> long int
	Pntr - Pntr -> Pntr      (This looks interesting ....)
	Pntr - Pntr -> NewAnsiTypeOfDubiousValueFirstImplementedByBorland
	Pntr - Pntr -> ????

-- 
					:alex.

alex@umbc3.umd.edu

gwyn@brl-smoke.UUCP (01/16/88)

In article <708@umbc3.UMD.EDU> alex@umbc3.UMD.EDU (Alex S. Crain) writes:
>Fix? How pray tell would you fix it? make it machine dependent? or just
>arbitrary? 

It should be obvious what to do if you think about it.

First, only pointers into the same object can meaningfully be subtracted.
Second, the result of the subtraction necessarily has implementation-
defined signed integral type; there is a typedef ptrdiff_t in <stddef.h>.

alex@umbc3.UUCP (01/17/88)

In article <7109@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <708@umbc3.UMD.EDU> alex@umbc3.UMD.EDU (Alex S. Crain) writes:
>>Fix? How pray tell would you fix it? make it machine dependent? or just
>>arbitrary? 
>
>It should be obvious what to do if you think about it.
>
>First, only pointers into the same object can meaningfully be subtracted.
>Second, the result of the subtraction necessarily has implementation-
>defined signed integral type; there is a typedef ptrdiff_t in <stddef.h>.


	Hmm..

	I don't really understand the answer; I think you said "implementation
dependent".  If so, how is that a fix?  When I'm calculating offsets in a lisp
implementation, and I decide that the fastest way is to use the actual
addresses and relatively reference functions, and I say

	funcall (ptr1 - ptr2); /* ptr1 - ptr2 > 64k */

what happens?  Or do I just check stddef.h for the answer?

	I would prefer 32-bit ints (or rather, the largest default integer
available, including pointers), because conversion to short int is absolutely
painless (mov.l becomes mov.w) while conversion the other way can be a rather
serious problem.


-- 
					:alex.

alex@umbc3.umd.edu

levy@ttrdc.UUCP (Daniel R. Levy) (01/17/88)

In article <7092@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
#> In article <39aca826.7f32@apollo.uucp> johnf@apollo.uucp (John Francis) writes:
#> >The result of subtracting two pointers is *defined* (K&R) to yield
#> >a result of type int.
#> 
#> That's another of the things that ANSI C intends to fix.

ok, so what will the result be?  long??
-- 
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |  	<most AT&T machines>}!ttrdc!ttrda!levy
| AT&T Computer Systems Division |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/18/88)

In article <712@umbc3.UMD.EDU> alex@umbc3.UMD.EDU (Alex S. Crain) writes:
>	I don't really understand the answer, I think you said "implementation
>dependent". If so, how is that a fix?

It *is* implementation-dependent, whether or not you like it that way.
The "fix" is to quit insisting that ptrdiff_t has to be int and to permit
it to be long where necessary.

> when ... i say 
>	funcall (ptr1 - ptr2); /* ptr1 - ptr2 > 64k */
>what happens? or do I just check stddef.h for the answer?

What do you mean, what happens?  You pass an argument having some signed
integral type to funcall().  Presumably you have properly declared
funcall() as in
	extern void funcall( ptrdiff_t );

If you don't like using a type other than a known basic type, you can of
course coerce the ptrdiff_t into a long via a (long) cast, but seldom is
this necessary or useful.  (It is necessary if you're going to printf()
the value of a ptrdiff_t.)

chip@ateng.UUCP (Chip Salzenberg) (01/20/88)

In the beginning, gwyn@brl.arpa (Doug Gwyn) wrote:
>First, only pointers into the same object can meaningfully be subtracted.
>Second, the result of the subtraction necessarily has implementation-
>defined signed integral type; there is a typedef ptrdiff_t in <stddef.h>.

Then alex@umbc3.UMD.EDU (Alex S. Crain) asks what dpANS says about code like:
>	funcall (ptr1 - ptr2); /* ptr1 - ptr2 > 64k */

The expression "ptr1 - ptr2" evaluates to an expression of the type
ptrdiff_t.  If you did your homework, then you defined the parameter to
funcall as a ptrdiff_t, so your stack will line up neatly.

But Doug did not mention whether ptrdiff_t is a signed or unsigned type.
Or is this not specified in dpANS?  When will I stop asking rhetorical
questions? :-)

>	I would prefer 32-bit ints (or rather, the largest default integer
>available, including pointers), because conversion to short int is absolutely
>painless (mov.l becomes mov.w) while conversion the other way can be a rather
>serious problem.

Conversion from 32 bits to 16 bits is painless, but for many architectures,
manipulation of 32-bit quantities is awkward and/or slow.  Thus X3J11 is
avoiding (whenever possible) giving minimum sizes for standard types.

Repeat after me:  "Not all C programs run on 32-bit architectures."
-- 
Chip Salzenberg                 UUCP: "{codas,uunet}!ateng!chip"
A T Engineering                 My employer's opinions are a trade secret.
       "Anything that works is better than anything that doesn't."

pablo@polygen.uucp (Pablo Halpern) (01/20/88)

In article <708@umbc3.UMD.EDU> alex@umbc3.UMD.EDU (Alex S. Crain) writes:
>>>The result of subtracting two pointers is *defined* (K&R) to yield
>>>a result of type int.
>>
>>That's another of the things that ANSI C intends to fix.
>
>
>Fix? How pray tell would you fix it? make it machine dependent? or just
>arbitrary? 

The ANSI Draft that I have (Oct 1, 1986) describes the header file <stddef.h>
which defines two types.  As quoted from the document:

	The types are

		ptrdiff_t

	which is the signed integral type of the result of subtracting
	two pointers; and

		size_t

	which is the unsigned integral type of the result of the sizeof
	operator.

(end quote)
A machine with a short word but long pointer type (e.g. 68000, 8086) would
have definitions like

	typedef long		ptrdiff_t;
	typedef unsigned long	size_t;

in <stddef.h>.  Machines with short pointers (e.g. 8080, 6502) would
have definitions like

	typedef int		ptrdiff_t;
	typedef unsigned	size_t;

or better yet

	typedef long		ptrdiff_t;
	typedef unsigned	size_t;

The reason for the long ptrdiff_t definition is that the difference
between two unsigned, 16-bit pointers can exceed the -32768 to +32767
range of a 16-bit two's complement signed integer.  Machines with long
words and long pointers could use any of the above pairs of
declarations.

So the short answer to the question is: the difference of two pointers
has a type that is machine (and compiler) dependent.  But this is no
worse (or better) than making the definition of an int machine dependent.

Pablo Halpern,  Polygen Corp.  UUCP: {bu-cs,princeton}!polygen!pablo

john@frog.UUCP (John Woods, Software) (01/20/88)

In article <147@ateng.UUCP>, chip@ateng.UUCP (Chip Salzenberg) writes:
> Andrew Scott asks:
> >Why would a compiler vendor for the 68000 choose a 32 bit size for an int?
> To which Doug Gwyn replies:
> >To make it easier to port sloppily-written VAX code.
> Yes -- but not all compiler writers have buckled under the great mass of
> "AllTheWorldsAVax" programs.

Well, now that you mention it:  CRDS has always used 32 bits for an int on
the 68000, largely because (a) the 68020 was promised to be "real" 32 bits
when it came out, and (b) the rest of the system we were designing was 32
bits (it just had this bottleneck at the CPU chip).  (These, plus a desire
for marketing hype, I suppose).

The MC68000 is closer to being a 32 bit machine than the PDP-11 was, even if
it lacked niceties like 32 bit multiply or divide.

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu

"Cutting the space budget really restores my faith in humanity.  It
eliminates dreams, goals, and ideals and lets us get straight to the
business of hate, debauchery, and self-annihilation."
		-- Johnny Hart

alex@umbc3.UMD.EDU (Alex S. Crain) (01/20/88)

In article <155@ateng.UUCP> chip@ateng.UUCP (Chip Salzenberg) writes:
	[lots of stuff, and...]
>Repeat after me:  "Not all C programs run on 32-bit architectures."

	Correct. And I like the idea of a compiler that allows switchable
default int sizes, BUT...

	I still prefer 32-bit default ints to 16.  Why?
	1) because I do a lot of stuff that won't fit in 16 bits.
	2) because of the way I write code, i.e.:
		a) block the project into little pieces.
		b) make each little piece work.
		c) rough tune the entire project.
		d) fine tune the little pieces.

There's more to it than that, but the order is about right.  I've found
that no amount of preplanning will allow for the great idea that I get at
4:45am 3 weeks into the project, and I don't want to be screwing around
with size errors while I'm adding in whatever it is that I've forgotten.
I.e., I want the program to work now; it can work well later, if I have time.

	All in all, it's a matter of preference. My code uses a lot of
short ints, and that's fine with me. And I haven't heard any real good
reasons why ints should be 16 bits besides programmer preference. I've
gotten to the point where I prototype everything anyway, and I always
cruise through the final product looking for dead variables, etc., 
and scale everything down then. 

-- 
					:alex.

alex@umbc3.umd.edu

aglew@ccvaxa.UUCP (01/21/88)

>[gwyn@brl-smoke.ARPA]
>If you don't like using a type other than a known basic type, you can of
>course coerce the ptrdiff_t into a long via a (long) cast, but seldom is
>this necessary or useful.  (It is necessary if you're going to printf()
>the value of a ptrdiff_t.)

I suppose that there is no equivalent to %p for pointers for
ptrdiff_t?

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/21/88)

In article <155@ateng.UUCP> chip@ateng.UUCP (Chip Salzenberg) writes:
>In the beginning, gwyn@brl.arpa (Doug Gwyn) wrote:
>>Second, the result of the subtraction necessarily has implementation-
>>defined signed integral type; there is a typedef ptrdiff_t in <stddef.h>.
          ^^^^^^
>But Doug did not mention whether ptrdiff_t is a signed or unsigned type.

See ^^^^^^ above.

>Repeat after me:  "Not all C programs run on 32-bit architectures."

That's for sure.  Some of mine run on 64-bit architectures.

dsill@NSWC-OAS.arpa (Dave Sill) (01/22/88)

In article <155@ateng.UUCP> Chip Salzenburg <ateng!chip> writes:
>In the beginning, gwyn@brl.arpa (Doug Gwyn) wrote:
>>First, only pointers into the same object can meaningfully be subtracted.
>>Second, the result of the subtraction necessarily has implementation-
>>defined signed integral type; there is a typedef ptrdiff_t in <stddef.h>.
          ------
>But Doug did not mention whether ptrdiff_t is a signed or unsigned type.
>Or is this not specified in dpANS?  When will I stop asking rhetorical
>questions? :-)

Perhaps when you start reading the articles you quote. :-)

=========
The opinions expressed above are mine.

"I shed, therefore, I am."
					-- ALF

gene@cooper.cooper.EDU (Gene (the Spook)) (01/23/88)

in article <7092@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) says:
> 
> In article <39aca826.7f32@apollo.uucp> johnf@apollo.uucp (John Francis) writes:
>>The result of subtracting two pointers is *defined* (K&R) to yield
>>a result of type int.
 
> That's another of the things that ANSI C intends to fix.

"Fix"? "FIX"???? Please! ANSI's done enough already. Why don't they just
leave it alone? After all, what else would it be?

Take the example of an offset from a pointer. If, for example, you have

		p1 = p0[d]

This would be interpreted as

		(ptr) p1 = (ptr) p0 + (int) d

Simply, add an integer displacement to a pointer, and you'll get another
pointer. With a little algebra, solve for 'd'. You'll get

		(int) d = (ptr) p1 - (ptr) p0

This is simply solving for the displacement. What else could it be???
I think the K&R solution is just fine the way it is. How in the world
could ANSI "fix" it? Can someone please explain that to me? Really, I
would appreciate an email'ed response, since I don't often get to read
the news. Thanx in advance.

					Spookfully yours,
					Gene

					...!ihnp4!philabs!phri!cooper!gene


	"If you think I'll sit around as the world goes by,
	 You're thinkin' like a fool 'cause it's case of do or die.
	 Out there is a fortune waitin' to be had.
	 You think I'll let it go? You're mad!
	 You got another thing comin'!"

			- Robert John Aurthur Halford

news@ism780c.UUCP (News system) (01/28/88)

In article <7159@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <28700025@ccvaxa> aglew@ccvaxa.UUCP writes:
>>I suppose that there is no equivalent to %p for pointers for
>>ptrdiff_t?
>
>Why should there be?  printf( "%ld", (long)(p2 - p1) );

The problem with a header file defining a type ptrdiff_t is that the
defined type must be one of the types in the base C language.  If an
implementation defines ptrdiff_t such that sizeof(ptrdiff_t) == sizeof(void*)
then p1-p2 may overflow.

BTW, the definition of pointer difference in K&R is "if p and q point to
members of the same array, p-q is the number of elements between p and q."

One way to interpret this is the following:

     el1  el2  el3
      ^         ^
      |         |
      p         q

As can be seen from the diagram, the number of elements "between" p and q is
1 (not 2).  Furthermore, the number of elements between p and q is clearly
the same as the number of elements between q and p, i.e., p-q == q-p and no
overflow is possible.  I hope the definition in the proposed standard is
less ambiguous than the wording in K&R.

       Marv Rubinstein -- Interactive Systems

nevin1@ihlpf.ATT.COM (00704a-Liber) (01/28/88)

In article <1186@cooper.cooper.EDU> gene@cooper.cooper.EDU (Gene (the Spook)) writes:
.Take the example of an offset from a pointer. If, for example, you have
.
.		p1 = p0[d]
.
.This would be interpreted as
.
.		(ptr) p1 = (ptr) p0 + (int) d

Actually, I thought that 'p1 = p0[d]' is interpreted as

	(ptr) p1 = (ptr) p0 +  (long) (d * sizeof(type p0))

assuming long can hold all possible additions to pointers.

.Simply, add an integer displacement to a pointer, and you'll get another
.pointer. With a little algebra, solve for 'd'. You'll get
.
.		(int) d = (ptr) p1 - (ptr) p0
.
.This is simply solving for the displacement. What else could it be???

First off, d should be a long, not an int.  A possibility for what d should be
is the value that makes p0[d] == p1.  (I am not suggesting that this happen.)

.I think the K&R solution is just fine the way it is.

100% agreement here.
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

dag@chinet.UUCP (Daniel A. Glasser) (01/29/88)

In article <8728@ism780c.UUCP> marv@ism780.UUCP (Marvin Rubenstein) writes:
>
[stuff deleted]
>
>BTW, the definition of pointer difference in K&R is "if p and q point to
>members of the same array , p-q is the number of elements between p and q."
>
>One way to interpret this is the following:
>
>     el1  el2  el3
>      ^         ^
>      |         |
>      p         q
>
>As can be seen from the diagram the number of elements "between" p and q is
>1 (not 2).  Furthermore, the number of elements between p and q is clearly
>the same as the number of elements between q and p.  i.e, p-q == q-p and no
>overflow is possible.  I hope the defininition in the proposed standard is
>less ambigious than the wording in K&R.
>
I think you misinterpret the meaning of where things point to.

The correct diagram for this would be

    +-----+-----+-----+-----	You see, this is not ambiguous at
    | el1 | el2 | el3 | ...	all -- there are exactly two elements
    +-----+-----+-----+-----	between the addresses pointed to by
    ^           ^          	p and q.
    |           |
    p           q

This is a common mistake for novice programmers and for HLL programmers
who are not familiar with pointers.  Pointers generally
contain the address where the "pointed to" item begins, not the
item itself.  When I was tutoring in college, one of the professors
overdid the "mailbox" analogy, and a number of students believed that
the computer must have different sized "boxes" for different kinds
of data...  So integer boxes were addressed the same as float boxes,
just in a different part of memory.  "I have a program that the Fortran
compiler says has too much integer data.  I'm not using any reals, so can
I use the real storage for integers?" (I actually got that question!)

I find that Basic programmers make this error most often.

-- 
Nobody at the place where I work	Daniel A. Glasser
knows anything about my opinions	...!ihnp4!chinet!dag
my postings, or me for that matter!	...!ihnp4!mwc!dag
					...!ihnp4!mwc!gorgon!dag
	One of those things that goes "BUMP!!! (ouch!)" in the night.

am@cl.cam.ac.uk (Alan Mycroft) (01/29/88)

In article <7159@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>[gwyn@brl-smoke.ARPA]
>If you don't like using a type other than a known basic type, you can of
>course coerce the ptrdiff_t into a long via a (long) cast, but seldom is
>this necessary or useful.  (It is necessary if you're going to printf()
>the value of a ptrdiff_t.)
>>I suppose that there is no equivalent to %p for pointers for
>>ptrdiff_t?
>
>Why should there be?  printf( "%ld", (long)(p2 - p1) );

To reopen this discussion, I agree that Gwyn's code works, but it
essentially falls in the deprecated category, in that a future ANSI-C
standard could add extra types, e.g. long long int, and on implementations
with ptrdiff_t = long long int, (long)(p2-p1) would fail.
The problem is that there is no portable way to pass ptrdiff_t or clock_t
to printf().

Even more problematic:  has anyone on the ANSI committee tried
writing a STRICTLY CONFORMING program which accurately prints cpu time
(a la clock()) to a file (say in centi-secs or to 2 decimal places)?
Perhaps printf("%t", (time_t)...)?
The trouble is that time_t may be either integral or floating.
Thus code like    printf("%ld", (long)(clock()*100/CLK_TCK));
MAY work, or the *100 may overflow.  Now there is no guarantee
that using double arithmetic will work either:  time_t may be
more precise (32 bits) than an f.p. mantissa.
Thus  printf("%f", (double)clock() * 100.0 / CLK_TCK);
is probably the best that one can do.  I still feel uneasy about the
gratuitous use of floating point there though.

I concede that someone may think of a very good reason why
time_t may be floating (the draft says arithmetic), and anyway
it is asctime()'s job to decode it.  However, I think
that clock_t could sensibly be restricted to be integral,
since its job is to *count* CLK_TCK's anyway.

Views?

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/30/88)

In article <8728@ism780c.UUCP> marv@ism780.UUCP (Marvin Rubenstein) writes:
-The problem with a header file defining a type ptrdiff_t is that the
-defined type must be one of the types in the base C language.  If an
-implementation defines ptrdiff_t such that sizeof(ptrdiff_t) == sizeof(void*)
-then p1-p2 may overflow.

"As with any other arithmetic overflow, if the result does not fit in the
space provided, the behavior is undefined."

It is up to the implementation to choose a reasonable size for ptrdiff_t.

-BTW, the definition of pointer difference in K&R is "if p and q point to
-members of the same array , p-q is the number of elements between p and q."
-One way to interpret this is the following:

Yes, that's been fixed in the proposed ANSI C standard.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/30/88)

In article <569@tuvie> rcvie@tuvie.UUCP (Alcatel-ELIN Forsch.z.) writes:
>Besides, subtraction of pointers should work for any two pointers, ...

Not feasible for segmented architectures.

richard@aiva.ed.ac.uk (Richard Tobin) (01/31/88)

In article <1131@jenny.cl.cam.ac.uk> am@cl.cam.ac.uk (Alan Mycroft) writes:
>Even more problematic:  has anyone on the ANSI committee tried
>writing a STRICTLY CONFORMING program which accurately prints cpu time
>(a la clock()) to a file (say in centi-secs or to 2 decimal places)?
>Perhaps printf("%t", (time_t)...)?
>The trouble is that time_t may be either integral or floating.

Hmmm... I haven't got a standard handy, so maybe I'm missing something,
but how about:
	
	int time_is_floating;
	time_t t;

	time_is_floating = (t = 1.5, t > 1);

	if(time_is_floating)
	    ....

Ok, so it's a hack.

-- 
Richard Tobin,                         JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,             ARPA:  R.Tobin%uk.ac.ed@nss.cs.ucl.ac.uk
Edinburgh University.                  UUCP:  ...!ukc!ed.ac.uk!R.Tobin

news@ism780c.UUCP (News system) (02/02/88)

In article <7199@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <8728@ism780c.UUCP> marv@ism780.UUCP (Marvin Rubenstein) writes:
>-The problem with a header file defining a type ptrdiff_t is that the
>-defined type must be one of the types in the base C language.  If an
>-implementation defines ptrdiff_t such that sizeof(ptrdiff_t) == sizeof(void*)
>-then p1-p2 may overflow.
>
>"As with any other arithmetic overflow, if the result does not fit in the
>space provided, the behavior is undefined."
>
>It is up to the implementation to choose a reasonable size for ptrdiff_t.

The problem I (Marv Rubinstein) was trying to address is that soon (hopefully
before we discard C) there will be machines with 48 bit (or longer) pointers
and 32 bit arithmetic registers.  On such a machine, character arrays with
more than 2**31 elements will be reasonable.  In order to be conforming,
'longs' would have to be defined to contain more than 32 bits to hold
(reasonable) pointer differences, while other operations on longs would need
only 32 bits.  That is why I think a more general solution would be to have
the type ptrdiff_t built into the language.  Note I am not suggesting a
change in the language; I am only suggesting that the proposed solution
is only marginally better than K&R's "pointer difference is an int".

     Marv Rubinstein

karl@haddock.ISC.COM (Karl Heuer) (02/03/88)

In article <8817@ism780c.UUCP> marv@ism780.UUCP (Marvin Rubenstein) writes:
>The problem I (Marv Rubinstein) was trying to address is that soon (hopefully
>before we discard C) there will be machines with 48 bit (or longer) pointers
>and 32 bit arithmetic registers.

I also noted this problem.  Apparently an implementation can still be
conforming if it has additional types (using, say, `__verylong int' rather
than `long long int', so as to avoid violating a constraint).  The type of
ptrdiff_t must be a signed arithmetic type.  Must it be one of the standard
arithmetic types, or is it permissible for it to be larger than a long int?
I'm not sure what the dpANS says about this.

Btw, one solution for the printf-format problem is to require something like
  #define FORMAT_TIME_T     "%ld"
  #define FORMAT_PTRDIFF_T  "%d"
in some header file; string pasting makes this feasible.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

meissner@xyzzy.UUCP (Usenet Administration) (02/03/88)

In article <8817@ism780c.UUCP> marv@ism780.UUCP (Marvin Rubenstein) writes:
| The problem I (Marv Rubinstein) was trying to address is that soon (hopefully
| before we discard C) there will be machines with 48 bit (or longer) pointers
| and 32 bit arithmetic registers.  On such a machine, character arrays with
| more than 2**31 elements will be reasonable.  In order to be conforming,
| 'longs' would have to be defined to contain more than 32 bits to hold
| (reasonable) pointer differences, while other operations on longs would need
| only 32 bits.  That is why I think a more general solution would be to have
| the type ptrdiff_t built into the language.  Note I am not suggesting a
| change in the language; I am only suggesting that the proposed solution
| is only marginally better than K&R's "pointer difference is an int".

Shades of the 80286.  Since the difference of two pointers is only meaningful
(in C) between two elements of the same array, how can you access array
elements if an implementation allows a bigger array dimension than it can
represent with an integral type?  All sorts of things would break.
-- 
Michael Meissner, Data General.		Uucp: ...!mcnc!rti!xyzzy!meissner
					Arpa/Csnet:  meissner@dg-rtp.DG.COM

gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/08/88)

In article <1131@jenny.cl.cam.ac.uk> am@cl.cam.ac.uk (Alan Mycroft) writes:
-I concede that someone may think of a very good reason as to why
-time_t may be floating (the draft says arithmetic) and anyway
-it is asctime()'s job to decode it.  However I think
-that clock_t could be sensible restricted to be integral
-since its job is to *count* CLK_TCK's anyway.

I seem to recall that somebody pointed out that an integer would
overflow in an unduly small amount of real time on a system with
a high-resolution system clock.